The interaction trap systems of the prior art are based on the finding that most
eukaryotic transcription activators are modular. Brent and Ptashne showed that the
activation domain of yeast GAL4, a yeast transcription factor, could be fused to the DNA
binding domain of E. coli LexA to create a functional transcription activator in yeast (Brent
et al. (1985) Cell 43:729-736). There is evidence that transcription can be activated through
the use of two functional domains of a transcription factor: a domain that recognizes and
binds to a specific site on the DNA and a domain that is necessary for activation. The
transcriptional activation domain is thought to function by contacting other proteins
involved in transcription. The DNA-binding domain appears to function to position the
transcriptional activation domain on the target gene that is to be transcribed. These and
similar experiments (Keegan et al. (1986) Science 231:699-704) formally define activation
domains as portions of proteins that activate transcription when brought to DNA by DNA
binding domains. Moreover, it was discovered that the DNA binding domain does not have
to be physically on the same polypeptide as the activation domain, so long as the two
separate polypeptides interact with one another (Ma et al. (1988) Cell 55:443-446).

Fields and his coworkers made the seminal suggestion that protein interactions could
be detected if two potentially interacting proteins were expressed as chimeras. They
devised a method based on the properties of the yeast Gal4 protein, which
consists of separable domains responsible for DNA-binding and transcriptional activation.
Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4 DNA-binding
domain fused to a polypeptide sequence of a known protein and the other consisting
of the Gal4 activation domain fused to a polypeptide sequence of a second protein, are
constructed and introduced into a yeast host cell. Intermolecular binding between the two
fusion proteins reconstitutes the Gal4 DNA-binding domain with the Gal4 activation
domain, which leads to the transcriptional activation of a reporter gene (e.g., lacZ, HIS3)
which is operably linked to a Gal4 binding site.
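
The logic of the two-hybrid arrangement described above can be sketched as a small model. This is an illustrative abstraction only, not part of the cited method: the class name `Fusion`, the function `reporter_active`, and the interaction table are hypothetical, and real reporter output is quantitative rather than on/off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A hybrid protein: one functional domain fused to a test polypeptide."""
    domain: str        # "DBD" (DNA-binding domain) or "AD" (activation domain)
    test_protein: str  # name of the fused test polypeptide


def reporter_active(bait, prey, interacting_pairs):
    """Return True only when a DBD fusion and an AD fusion are bridged
    by binding between their test polypeptides."""
    if bait.domain != "DBD" or prey.domain != "AD":
        return False  # both functional domains must be present
    pair = frozenset((bait.test_protein, prey.test_protein))
    return pair in interacting_pairs


# Hypothetical interaction table, for illustration only.
pairs = {frozenset(("X", "Y"))}
bait = Fusion("DBD", "X")
print(reporter_active(bait, Fusion("AD", "Y"), pairs))  # True: X binds Y
print(reporter_active(bait, Fusion("AD", "Z"), pairs))  # False: no interaction
```

The model makes explicit that neither fusion alone, nor two fusions whose test polypeptides do not bind, is expected to activate the reporter.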

All yeast-based interaction trap systems in the art share common elements (Chien et
al. (1991) PNAS 88:9578-82; Durfee et al. (1993) Genes & Development 7:555-69; Gyuris
et al. (1993) Cell 75:791-803; and Vojtek et al. (1993) Cell 74:205-14). All use (1) a
plasmid that directs the synthesis of a "bait": a known protein which is brought to DNA by
being fused to a DNA binding domain, (2) one or more reporter genes ("reporters") with
upstream binding sites for the bait, and (3) a plasmid that directs the synthesis of proteins
fused to activation domains and other useful moieties ("prey"). All current systems direct
the synthesis of proteins that carry the activation domain at the amino terminus of the
fusion, facilitating the expression of open reading frames encoded by, for example, cDNAs.

The prior art systems differ in their specifics. These details are typically relevant to
their successful use. Baits differ in their DNA binding domains. For example, some systems use
baits that contain native E. coli LexA repressor protein (Durfee et al. (1993) Genes &
Development 7:555-69; Gyuris et al. (1993) Cell 75:791-803). LexA binds tightly to
appropriate operators (Golemis et al. (1992) Mol. Cell. Biol. 12:3006-3014; Ebina et al.
(1983) J. Biol. Chem. 258:13258-13261), and carries a dimerization domain at its C
terminus (Brent R. (1982) Biochimie 64:565-569; Little J et al. (1982) Cell 29:11-22; and
Thliveris et al. (1991) Biochimie 73:449-455). In yeast, LexA and most LexA derivatives
enter the nucleus, but are not necessarily nuclear localized. Others use baits that contain a
portion of the yeast GAL4 protein (Chien et al. (1991) PNAS 88:9578-82; Durfee et al.
(1993) Genes & Development 7:555-69; and Harper et al. (1993) Cell 75:805-16). This
portion, comprising residues 1-147, is sufficient to bind tightly to appropriate DNA binding
sites, localize fused proteins to the nucleus, and direct dimerization; it also contains a
domain that weakly activates transcription in mammalian cell extracts in vitro, and it is
thus conceivable that this domain may increase transcription resulting from weakly
interacting proteins.

Reporter genes differ in the phenotypes they confer. The products of some reporter
genes (e.g., HIS3, LEU2) allow cells expressing them to be selected by growth on
appropriate media, while the products of others (e.g., lacZ) allow cells expressing them to be
visually screened. Reporters also differ in the number and affinity of upstream binding sites
(e.g., lexA operators) for the bait, and in the position of these sites relative to the
transcription startpoint (Gyuris et al., supra). Finally, they differ in the number of molecules
of the reporter gene product necessary to score the phenotype. These differences affect the
strength of the protein interactions the reporters can detect.

Preys differ in the activation domains they carry, and in whether they contain other 
useful moieties such as nuclear localization sequences and epitope tags. Some activation 
domains are stronger than others. Although strong activation domains should allow
detection of weaker interactions, their expression can also harm the cell due to poorly 
understood transcriptional effects, either by titration of cofactors necessary for transcription 

Summary of the Invention

The present invention provides methods and reagents for practicing various forms of 
an interaction trap assay using prokaryotic host cells, e.g., bacterial cells. 

For example, one aspect of the present invention relates to a method for detecting 
interaction between a first test polypeptide and a second test polypeptide. The method 
comprises a step of providing an interaction trap system including a prokaryotic host cell 
which contains a reporter gene operably linked to a transcriptional regulatory sequence 
which includes a binding site ("DBD recognition element") for a DNA-binding domain. 
The cell is engineered to include a first chimeric gene which encodes a first fusion protein, 
the first fusion protein including a DNA-binding domain and first test polypeptide. The cell 
also includes a second chimeric gene which encodes a second fusion protein including an 
activation tag (such as a polymerase interaction domain (PID)) which activates transcription
of the reporter gene when localized to the vicinity of the DBD recognition element. 
Interaction of the first fusion protein and second fusion protein in the host cell results in 
measurably greater expression of the reporter gene. Accordingly, the method also includes 
the steps of measuring expression of the reporter gene, and comparing the level of 
expression of the reporter gene to a level of expression in a control interaction trap system 
in which one or both of the first and second test polypeptides are missing from the first and
second fusion proteins and the resulting fusion proteins do not interact. A statistically
significant increase in the level of expression is indicative of an interaction between the first 
and second test polypeptide portions of the fusion proteins. 
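
The comparison step can be illustrated with a minimal sketch, assuming replicate reporter measurements are available for both the test system and the control system. The specification does not prescribe a statistical test; the exact one-sided permutation test used here, the function name `permutation_p_value`, and the replicate values are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test, control):
    """Exact one-sided permutation p-value for mean(test) > mean(control).

    Every way of relabelling the pooled measurements into a 'test' group
    of the same size is enumerated; the p-value is the fraction of
    relabellings whose difference in means is at least the observed one.
    """
    pooled = list(test) + list(control)
    n_test = len(test)
    n_control = len(pooled) - n_test
    observed = mean(test) - mean(control)
    total = sum(pooled)
    at_least_as_extreme = 0
    n_relabellings = 0
    for idx in combinations(range(len(pooled)), n_test):
        s = sum(pooled[i] for i in idx)
        diff = s / n_test - (total - s) / n_control
        if diff >= observed - 1e-12:  # tolerance for float rounding
            at_least_as_extreme += 1
        n_relabellings += 1
    return at_least_as_extreme / n_relabellings


# Hypothetical reporter readings (arbitrary units), not data from the source.
with_interaction = [410, 388, 432, 401]
control_system = [52, 61, 47, 58]
p = permutation_p_value(with_interaction, control_system)
print(round(p, 4))  # 0.0143, i.e. 1 of 70 relabellings
```

With four replicates per group the smallest attainable p-value is 1/70, so more replicates would be needed to claim significance at a stricter threshold.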

Another aspect of the present invention relates to a kit for detecting interaction
between a first test polypeptide and a second test polypeptide. The kit can include a first
vector for encoding a first fusion protein ("bait fusion protein"), which vector comprises a
first gene including (1) transcriptional and translational elements which direct expression in
a prokaryotic host cell, (2) a DNA sequence that encodes a DNA-binding domain and which
is functionally associated with the transcriptional and translational elements of the first
gene, and (3) a means for inserting a DNA sequence encoding a first test polypeptide into
the first vector in such a manner that the first test polypeptide is capable of being expressed
in-frame as part of a bait fusion protein containing the DNA binding domain. The kit will
also include a second vector for encoding a second fusion protein ("prey fusion protein"),
which comprises a second gene including (1) transcriptional and translational elements
which direct expression in a prokaryotic host cell, (2) a DNA sequence that encodes an
activation tag, such as a polymerase interaction domain (PID), the activation tag DNA
sequence being functionally associated with the transcriptional and translational elements of
the second gene, and (3) a means for inserting a DNA sequence encoding the second test
polypeptide into the second vector in such a manner that the second test polypeptide is
capable of being expressed in-frame as part of a prey fusion protein containing the
polymerase interaction domain. Additionally, the kit will include a prokaryotic host cell
containing a reporter gene having a binding site ("DBD recognition element") for the
DNA-binding domain, wherein the reporter gene expresses a detectable protein when a prey
fusion protein interacts with a bait fusion protein bound to the DBD recognition element;
the host cell being incapable of expressing a protein having the function of (a) the first
marker gene, (b) the second marker gene, (c) the DNA-binding domain, and (d) the
polymerase interaction domain. Binding of the first test polypeptide and the second test
polypeptide in the host cell results in measurably greater expression of the reporter gene
than the simultaneous presence of the DNA-binding domain and the polymerase interaction
domain in the absence of an interaction between the first test polypeptide and the second
test polypeptide.

Other features and advantages of the invention will be apparent from the following
detailed description, and from the claims. The practice of the present invention will
employ, unless otherwise indicated, conventional techniques of cell biology, cell culture,
molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology,
which are within the skill of the art. Such techniques are explained fully in the literature.
See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook,
Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes
I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et
al. U.S. Patent No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins
eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984);
Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And
Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the
treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For
Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor
Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); Immunochemical
Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London,
1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C.
Blackwell, eds., 1986); Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y., 1986).

Brief Description of the Figures 

Figure 1A illustrates that λcI binds DNA as a dimer, and pairs of dimers bind
cooperatively to adjacent operator sites.

Figure 1B illustrates the transcriptional complexes which may be formed with a prey
fusion protein resulting from replacement of the α-CTD (C-terminal domain) with the
λcI-CTD. As described in the appended examples, the hybrid α gene was generated by
replacing the gene segment encoding the α-CTD with a gene segment encoding the
λcI-CTD. A derivative of the lac promoter was also created bearing a single λ operator (OR2)
in place of the CRP-binding site (centered 62 bps upstream of the transcription startpoint).

Figure 2A illustrates the transcriptional complexes which may be formed with a prey
fusion protein resulting from replacement of the α-CTD with GAL11P and a bait protein
comprised of the λcI protein having GAL4 fused at its C-terminus.

Figure 2B is a graph indicating the ability of various fusion proteins of GAL11 and
GAL11P to function in the subject ITS.

Figure 3A depicts the presence of the ω subunit in E. coli RNA polymerase
complexes.

Figure 3B illustrates a covalent system for the ω subunit in a λcI-ω fusion protein.

Figure 3C is a graph indicating the ability of the λcI-ω fusion protein to drive
expression of a reporter gene having a λcI operator.

Figure 3D illustrates an ITS using the ω subunit in a GAL11P-ω fusion protein.

Figure 3E is a graph showing that co-expression of the GAL11P-ω fusion protein
with a λcI-GAL4 fusion protein can activate the expression of a reporter gene under the
transcriptional control of a λcI operator.

Figure 4 is a table illustrating the relative level of reporter gene expression with
various combinations of prey and bait fusion proteins derived from p53 sequences.

Detailed Description of the Invention 

The eukaryotic interaction trap system ("ITS"), originally developed by Fields and
Song (Nature (1989) 340:245) in yeast, is a powerful in vivo assay to detect protein-protein
interactions. It has already had a large impact on basic and applied biological research. In
industry, it is being used to isolate and characterize new targets for drug development. It
permits researchers to isolate small organic molecules, peptides, and nucleic acids that may
lead to new drugs. Future applications for genome characterization and for modulation of
specific protein-protein interactions are on the horizon. The ramifications of this technology
promise to be exciting. In this system, one protein is fused to a DNA binding domain, while
the other is fused to a transcriptional activating domain. If the two proteins interact in a
yeast cell, a functional transcriptional activator is reconstituted, the activity of which is
monitored by the expression of a reporter gene containing a cognate site for the DNA
binding domain. A number of different DNA binding domains and activation domains have
been successfully used in this system, as well as a variety of different reporter genes.
However, the interaction trap assays described in the art have only been generated in
eukaryotic cells. There are no examples in the art of an analogous system being generated
in prokaryotes.

The present invention makes available an interaction trap system (hereinafter "ITS") 
which is derived using recombinantly engineered prokaryotic cells. As described in the 
appended examples, the prokaryotic ITS derives in part from the unexpected finding that the 
natural interaction between a transcriptional activator and subunit(s) of an RNA polymerase 
complex can be replaced by a heterologous protein-protein interaction which is capable of
activating transcription. The versatility of the prokaryotic ITS makes it generally suitable
for many, if not all, of the applications of the eukaryotic ITS. Moreover, the ease of
manipulation of the bacterial cells, e.g., in transformation or transfection and culturing, 
means that even larger polypeptide libraries can be sorted in the prokaryotic ITS. 

The prokaryotic interaction trap systems described herein provide advantages over
the conventional eukaryotic ITS methods. For example, the use of bacterial host cells to

transcriptional regulatory sequences of the reporter gene with concomitant transcription of
the reporter gene. The method is carried out by introducing the first and second chimeric
genes into the host cell, and subjecting that cell to conditions under which the first and
second hybrid proteins are expressed in sufficient quantity for expression of the reporter
gene to be activated by interaction of the two fusion proteins if that interaction occurs. The
formation of a complex between the bait and prey fusion proteins results in a detectable
signal produced by the expression of the reporter gene. Accordingly, the formation of a
complex between a sample target protein and proteins encoded by a cDNA library, for
example, can be detected, and ITS cells isolated, if desired, on the basis of evaluating the
level of expression of the reporter gene.
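
Isolating library clones on the basis of reporter expression can be sketched as a simple filter. This is a hedged illustration, not a procedure taken from the specification: the function name `select_positive_clones`, the clone identifiers, the readings, and the three-fold cutoff are all hypothetical, and an appropriate cutoff would depend on the reporter and assay used.

```python
def select_positive_clones(reporter_levels, control_level, min_fold=3.0):
    """Return the IDs of clones whose reporter signal is at least
    `min_fold` times the signal of the non-interacting control."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return sorted(
        clone
        for clone, level in reporter_levels.items()
        if level / control_level >= min_fold
    )


# Hypothetical reporter readings for clones from a prey library.
readings = {"clone_01": 48.0, "clone_02": 390.0, "clone_03": 75.0, "clone_04": 210.0}
print(select_positive_clones(readings, control_level=50.0))
# ['clone_02', 'clone_04']
```

Clones passing the filter would then be candidates for isolation and re-testing against the control system.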

The method of the present invention, as described above, may be practiced using a
kit for detecting interaction between a first test protein and a second test protein. The kit
typically will include the two vectors for generating the chimeric proteins, a reporter gene
construct, and a host cell. The first vector contains a promoter and may include a
transcription termination signal functionally associated with the first chimeric gene in order
to direct the transcription of the first chimeric gene. The first chimeric gene includes a
DNA sequence that encodes a DNA-binding domain and a (unique) restriction site(s) for
inserting a DNA sequence encoding a first test polypeptide in such a manner that the first
test protein is expressed as part of a hybrid protein with the DNA-binding domain. The first
vector also includes a means for replicating itself in the host cell. Also included on the first
vector is, preferably, a first marker gene, the expression of which in the host cell permits
selection of cells containing the first marker gene. Exemplary marker genes confer
antibiotic resistance. Preferably, the first vector is a plasmid.

The second vector is derived for generating the second chimeric protein. The 
second chimeric gene includes a promoter and other relevant transcription and/or translation
sequences to direct expression of the chimeric gene. The second chimeric gene also 
includes a DNA sequence that encodes an activation tag and a (unique) restriction site(s) to 
insert a DNA sequence encoding the second test polypeptide into the vector, in such a 
manner that the second test protein is capable of being expressed as part of a hybrid protein 
with the activation tag. The second vector further includes a means for replicating itself in
the host cell. The second vector also includes a second marker gene, the expression of 
which in the host cell permits selection of cells containing the second marker gene. 

The kit includes a prokaryotic host cell, preferably a strain of E. coli or other
suitable bacterial strain, which can be engineered to express the bait and prey fusion 
proteins, and express the reporter gene in a manner dependent on the formation of 
complexes including the two fusion proteins. The host cell contains the reporter gene 
having a DNA binding site for the DNA-binding domain of the first hybrid protein. The 
binding site is positioned so that, upon interaction of the bait and prey fusion proteins, an 
RNA polymerase complex is recruited to the promoter sequence of the reporter gene, 
causing expression of the reporter gene. The host cell, by itself, is preferably incapable of 
expressing a protein having a function of the first marker gene, the second marker gene, the 
reporter gene, or the complex of the prey and bait fusion proteins. 

Accordingly, in using the kit the interaction of the bait and prey components of the 
two fusion proteins in the host cell causes a measurably greater expression of the reporter 
gene than when the DNA-binding domain and the polymerase interaction domain are 
provided alone, e.g., without one or both of the bait or prey polypeptides. The reporter gene 
may encode an enzyme or other product that can be readily measured. Such measurable 
activity may include the ability of the cell to grow only when the marker gene is 
transcribed, or the presence of detectable enzyme activity only when the marker gene is 
transcribed. 

The cells containing the two hybrid proteins are incubated in/on an appropriate 
medium and the cells are monitored, and optionally selected, by detecting expression of the 
reporter gene product. Expression of the reporter gene is an indication that the bait protein 
and the prey protein have interacted. 

II. Definitions

Before further description of the invention, certain terms employed in the 
specification, examples and appended claims are, for convenience, collected here. 

The term "prokaryote" is art recognized and refers to a unicellular organism lacking
a true nucleus and nuclear membrane, having genetic material composed of a single loop of
naked double-stranded DNA. Prokaryotes, with the exception of mycoplasmas, have a rigid
cell wall. In some systems of classification, a division of the kingdom Prokaryotae,
Bacteria include all prokaryotic organisms that are not blue-green algae (Cyanophyceae). In
other systems, prokaryotic organisms without a true cell wall are considered to be unrelated
to the Bacteria and are placed in a separate class, the Mollicutes.

The term "bacteria" is art recognized and refers to certain single-celled 
microorganisms of about 1 micrometer in diameter; most species have a rigid cell wall. 
They differ from other organisms (eukaryotes) in lacking a nucleus and membrane-bound
organelles and also in much of their biochemistry. 

As used herein, "recombinant cells" include any cells that have been modified by the 
introduction of heterologous DNA. 

As used herein, the terms "heterologous DNA" and "heterologous nucleic acid" are
meant to include DNA that does not occur naturally as part of the genome in which it is
present, or DNA which is found in a location or locations in the genome that differs from
that in which it occurs in nature, or occurs extrachromosomally, e.g., as part of a plasmid.

By "protein" or "polypeptide" is meant a sequence of amino acids of any length, 
constituting all or a part of a naturally-occurring polypeptide or peptide, or constituting a 
non-naturally-occurring polypeptide or peptide (e.g., a randomly generated peptide
sequence or one of an intentionally designed collection of peptide sequences). 

By a "DNA binding domain" or "DBD" is meant a polypeptide sequence which is 
capable of directing specific polypeptide binding to a particular DNA sequence (i.e., to a 
DBD recognition element). The term "domain" in this context is not intended to be limited 
to a discrete folding domain. Rather, consideration of a polypeptide as a DBD for use in the
bait fusion protein can be made simply by the observation that the polypeptide has a 
specific DNA binding activity. DNA binding domains, like activation tags, can be derived 
from proteins ranging from naturally occurring proteins to completely artificial sequences. 

The term "activation tag" refers to a polypeptide sequence capable of effecting
transcriptional activation, for example assembling or recruiting an active polymerase
complex. For instance, in the prokaryotic ITS the activation tag can be a polymerase
interaction domain or some other polypeptide sequence which interacts with, or is
covalently bound to, one or more subunits (or a fragment thereof) of an RNA polymerase
complex. Activation tags can also be sequences which are derived from, e.g., transcription
factors or auxiliary proteins of polymerase complexes or even from random polypeptide
libraries.

The terms "polymerase interaction domain" and "PID" refer to activation tags which
include determinants of an RNA polymerase subunit that mediate its interaction with other
polymerase subunits, or a polypeptide sequence which interacts with, or is covalently bound
to, one or more subunits (or a fragment thereof) of an RNA polymerase complex.

The terms "recombinant protein", "heterologous protein" and "exogenous protein"
are used interchangeably throughout the specification and refer to a polypeptide which is
produced by recombinant DNA techniques, wherein generally, DNA encoding the
polypeptide is inserted into a suitable expression vector which is in turn used to transform a
host cell to produce the heterologous protein. That is, the polypeptide is expressed from a
heterologous nucleic acid.

As used herein, a "reporter gene construct" is a nucleic acid that includes a "reporter
gene" operatively linked to transcriptional regulatory sequences. Transcription of the
reporter gene is controlled by these sequences. The activity of at least one of these
control sequences is directly or indirectly regulated by a transcriptional complex recruited
by virtue of interaction between the bait and prey fusion proteins. The transcriptional
regulatory sequences can include a promoter and other regulatory regions that modulate the
activity of the promoter, or regulatory sequences that modulate the activity or efficiency of
the RNA polymerase that recognizes the promoter. Such sequences are herein collectively
referred to as transcriptional regulatory elements or sequences. The reporter gene construct
will also include a "DBD recognition element" which is a nucleotide sequence that is
specifically bound by the DNA binding domain of the bait fusion protein. The DBD
recognition element is located sufficiently proximal to the promoter sequence of the reporter
gene so as to cause increased reporter gene expression upon recruitment of an RNA
polymerase complex by a bait fusion protein bound at the recognition element.

As used herein, a "reporter gene" is a gene whose expression may be assayed;
reporter genes may encode any protein that provides a phenotypic marker, for example: a
protein that is necessary for cell growth or a toxic protein leading to cell death, e.g., a
protein which confers antibiotic resistance or complements an auxotrophic phenotype; a
protein detectable by a colorimetric/fluorometric assay leading to the presence or absence of
color/fluorescence; or a protein providing a surface antigen for which specific
antibodies/ligands are available.

By "operably linked" is meant that a gene and transcriptional regulatory sequence(s)
are connected in such a way as to permit expression of the gene in a manner dependent upon
factors interacting with the regulatory sequence(s). In the case of the reporter gene, the
DBD recognition element will also be operably linked to the reporter gene such that
transcription of the reporter gene will be dependent, at least in part, upon bait-prey
complexes bound to the recognition element.

By "covalently bonded" it is meant that two domains are joined by covalent bonds,
directly or indirectly. That is, the "covalently bonded" proteins or protein moieties may be 
immediately contiguous or may be separated by stretches of one or more amino acids within 
the same fusion protein. 

By "altering the expression of the reporter gene" is meant a statistically significant 
increase or decrease in the expression of the reporter gene to the extent required for 
detection of a change in the assay being employed. It will be appreciated that the degree of 
change will vary depending upon the type of reporter gene construct or reporter gene 
expression assay being employed. 

The terms "interactors", "interacting proteins" and "candidate interactors" are used
interchangeably herein and refer to a set of proteins which are able to form complexes with 
one another, preferably non-covalent complexes. 

By "test protein" or "test polypeptide" is meant all or a portion of one of a pair of 
interacting proteins provided as part of the bait or prey fusion proteins. 

By "randomly generated" is meant sequences having no predetermined sequence; 
this is contrasted with "intentionally designed" sequences which have a DNA or protein 
sequence or motif determined prior to their synthesis. 

By "amplification" or "clonal amplification" is meant a process whereby the density 
of host cells having a given phenotype is increased. 

The terms "pool" of polypeptides, "polypeptide library" or "combinatorial
polypeptide library" are used interchangeably herein to indicate a variegated ensemble of
polypeptide sequences, where the diversity of the library may result from cloning or be
generated by mutagenesis. The terms "pool" of genes, "gene library" or "combinatorial
gene library" have a similar meaning, indicating a variegated ensemble of nucleic acids.

By "screening" is meant a process whereby a gene library is surveyed to determine
whether there exists within this population one or more genes which encode a polypeptide 
having a particular binding characteristic in the interaction trap assay. 

It is further noted that the following description of particular arrangements of test 
polypeptide sequences in terms of being part of the bait or prey fusion proteins is, in
general, arbitrary. As will be apparent from the description, the test polypeptide portions of 
any given pair of interacting bait and prey fusion proteins may ordinarily be swapped with 
one another. 

Each component of the system is now described in more detail. 

III. Bait protein constructs

One of the first steps in the use of the interaction trap system of the present
invention is to construct the bait fusion protein. To do this, sequences encoding a protein of
interest or a polypeptide library are cloned in-frame to a sequence encoding a DNA binding
domain (DBD), e.g., a polypeptide which specifically binds to a defined nucleotide
sequence. Those skilled in the art will appreciate from the present disclosure that there are a
wide variety of DNA binding domains that can be used to construct the bait fusion protein,
including polypeptides derived from naturally occurring DNA binding proteins, as well as
polypeptides derived from proteins artificially engineered to interact with specific DNA
sequences. Basic requirements for the bait fusion protein include the ability to specifically
bind a defined nucleotide sequence, and (preferably) that the bait fusion protein cause little
or no transcriptional activation of the reporter gene in the absence of an interacting prey
fusion protein. In addition, the bait polypeptide sequence should not affect the ability of the
DBD to bind to its cognate sequence in the transcriptional regulatory element of the reporter
gene.
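
The in-frame cloning requirement can be checked computationally before a construct is built. The following is a minimal sketch under stated assumptions: the DBD coding sequence is placed upstream of the insert, the DBD sequence is supplied without its own stop codon, and the sequences shown are toy examples rather than real DBD or test polypeptide sequences. The function names `codons` and `fusion_in_frame` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fusion_in_frame(dbd_cds, linker, insert):
    """Return True if `insert` would be translated in the reading frame of
    the upstream DBD coding sequence without premature termination."""
    upstream = (dbd_cds + linker).upper()
    if len(upstream) % 3 != 0:
        return False  # insert would begin out of frame
    if any(c in STOP_CODONS for c in codons(upstream)):
        return False  # translation would stop before reaching the insert
    insert = insert.upper()
    if len(insert) % 3 != 0:
        return False  # insert is not a whole number of codons
    # A stop codon is tolerated only as the final codon of the insert.
    return not any(c in STOP_CODONS for c in codons(insert)[:-1])


# Toy sequences for illustration; GGATCC is a BamHI recognition site.
dbd = "ATGAGCACA"
print(fusion_in_frame(dbd, "GGATCC", "GCTGAAACCTAA"))   # True
print(fusion_in_frame(dbd, "GGATCCG", "GCTGAAACCTAA"))  # False: frame shifted
```

A check of this kind addresses only the reading frame; whether the resulting bait still binds its cognate site, and remains transcriptionally inert, must be determined experimentally.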

In one preferred embodiment, the DBD portion of the bait fusion protein is derived
using all, or a DNA binding portion, of a transcriptional regulatory protein, e.g., of either a
transcriptional activator or transcriptional repressor, which retains the ability to selectively
bind to particular nucleotide sequences. The DNA binding domains of the bacteriophage
λcI protein (hereinafter "λcI") and the E. coli LexA repressor (hereinafter "LexA") represent
preferred DNA binding domains for the bait fusion proteins of the instant interaction trap
system. The use of a well-defined system, such as λcI or LexA, allows knowledge
regarding the interaction between a DNA binding domain and its DBD recognition element
(i.e., the λcI or LexA operator) to be exploited for the purpose of optimizing operator
occupancy and/or optimizing the geometry of the bound bait protein to effect maximal gene 
activation. In constructing the bait fusion protein, the DNA binding activity of the fusion 
protein can be, as appropriate, provided by using all or a portion of the transcriptional 
regulatory protein. Depending on the sequences of the regulatory protein retained in the 
bait fusion protein, it may be desirable to mutate certain residues of those retained 
sequences which may contribute to transcriptional activation or repression in the absence of 
the prey fusion protein, e.g., in order to reduce prey-independent modulation of reporter 
gene transcription. 

However, any other transcriptionally inert or essentially transcriptionally-inert DNA 
binding domain may be used to create the bait fusion protein in the instant interaction trap 
system; such DNA binding domains are well known and include, but are not limited to, such 
motifs as helix-turn-helix motifs (such as found in λcI), winged helix-turn-helix motifs 
(such as found in certain heat shock transcription factors), and/or zinc fingers/zinc clusters. 
As merely illustrative, the bait fusion protein can be constructed utilizing the DNA binding 
portions of the LysR family of transcriptional regulators, e.g., TrpI, IlvY, OccR, OxyR, 
CatR, NahR, MetR, CysB, NodD or SyrM (Schell et al. (1993) Annu Rev Microbiol 
47:597), or the DNA binding portions of the PhoB/OmpR-related proteins, e.g., PhoB, 
OmpR, CacC, PhoM, PhoP, ToxR, VirG or SfrA (Makino et al. (1996) J Mol Biol 259:15), 
or the DNA binding portions of histones H1 or H5 (Suzuki et al. (1995) FEBS Lett 
372:215). Other exemplary DBDs which can be used to generate the bait fusion protein 
include DNA binding portions of the P22 Arc repressor, MetJ, CENP-B, Rap1, 
XylS/Ada/AraC, Bir5 or DtxR. 

Furthermore, the DNA binding domain need not be obtained from the protein of a 
prokaryote. For example, polypeptides with DNA binding activity can be derived from 
proteins of eukaryotic origin, including from yeast. For example, the DBD portion of the 
bait fusion protein can include polypeptide sequences from such eukaryotic DNA binding 
proteins as p53, jun, fos, GCN4, or GAL4. Likewise, the DNA binding portion of the bait 
fusion protein can be generated from viral proteins, such as the papillomavirus E2 protein 
(c.f., PCT publication WO 96/19566). In yet other embodiments, the DNA binding protein 
can be generated by combinatorial mutagenic techniques, and represent a DBD not naturally 
occurring in any organism. A variety of techniques have been described in the art for 
generating novel DNA binding proteins which can selectively bind to a specific DNA 
sequence (c.f., U.S. Patent 5,198,346 entitled "Generation and selection of novel DNA- 
binding proteins and polypeptides"). 

As appropriate, the DNA binding motif used to generate the bait fusion protein can 
include oligomerization motifs. As known in the art, certain transcriptional regulators 
dimerize, with dimerization promoting cooperative binding of the two monomers to their 
cognate recognition elements. For example, where the bait protein includes a LexA DNA 
binding domain, it can further include a LexA dimerization domain; this optional domain 
facilitates efficient LexA dimer formation. Because LexA binds its DNA binding site as a 
dimer, inclusion of this domain in the bait protein also optimizes the efficiency of operator 
occupancy (Golemis and Brent, (1992) Mol. Cell Biol. 12:3006). Other oligomerization 
motifs useful in the present invention will be readily recognized by those skilled in the 
art. Exemplary motifs include the tetramerization domain of p53 and the tetramerization 
domain of BCR-ABL. In addition, the art also provides a variety of techniques for 
identifying other naturally occurring oligomerization domains, as well as oligomerization 
domains derived from mutant or otherwise artificial sequences. See, for example, Zeng et al. 
(1997) Gene 185:245. 

As described below, binding efficiency of the bait fusion protein for the recognition 
element of the reporter gene can also be fine-tuned by the particular sequence of the DBD 
recognition element, and its proximity to other transcriptional regulatory sequences in the 
reporter gene construct. Likewise, the binding efficiency and/or specificity of the DBD 
portion of the bait fusion protein can be altered by mutagenesis. 

The bait portion of the bait fusion protein may be chosen from any protein of interest 
and includes proteins of unknown, known, or suspected diagnostic, therapeutic, or 
pharmacological importance. Exemplary bait proteins include, but are not limited to, 
oncoproteins (such as myc, particularly the C-terminus of myc, ras, src, fos, and particularly 
the oligomeric interaction domains of fos), tumor-suppressor proteins (such as p53, Rb, 
INK4 proteins [p16INK4a, p15INK4b], CIP/KIP proteins [p21CIP1, p27KIP1]) or any 
other proteins involved in cell-cycle regulation (such as kinases and phosphatases). In other 
embodiments, the bait polypeptide can be generated using all or a portion of a protein 
involved in signal transduction, including such motifs as SH2 and SH3 domains, ITAMs, 
ITIMs, kinase, phospholipase, or phosphatase domains, cytoplasmic tails of receptors and 
the like. Yet other preferred bait fusion proteins are generated with cytoskeletal proteins or 
factors involved in transcription or translation, or portions thereof. Still other bait fusion 
proteins can be generated with viral proteins. 

In preferred embodiments, where the bait protein includes a catalytic domain of an 
enzyme, the fusion protein is derived with a catalytically inactive mutant, most preferably a 
mutant which binds substrate with about the Km of the wild-type enzyme but with a greatly 
diminished kcat for the catalyzed reaction with the substrate. For example, mutation of a 
residue in the catalytic site of the enzyme can give rise to such catalytically inactive 
mutants. Particular examples include point mutation of the active site lysine of a kinase, the 
active site serine of a serine protease or the active site cysteine of a phosphatase. Thus, the 
binding of the bait polypeptide portion of the fusion protein to a polypeptide substrate 
presented by a prey fusion protein can be enhanced. In each case, the protein of interest is 
fused to a DNA binding domain as generally described herein. 

The use of recombinant DNA techniques to create a fusion gene, with the 
translational product being the desired bait fusion protein, is well known in the art. 
Essentially, the joining of various DNA fragments coding for different polypeptide 
sequences is performed in accordance with conventional techniques, employing blunt-ended 
or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate 
termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid 
undesirable joining, and enzymatic ligation. Alternatively, the fusion gene can be 
synthesized by conventional techniques including automated DNA synthesizers. In another 
method, PCR amplification of gene fragments can be carried out using anchor primers 
which give rise to complementary overhangs between two consecutive gene fragments 
which can subsequently be annealed to generate a chimeric gene sequence (see, for example, 
Current Protocols in Molecular Biology, Eds. Ausubel et al., John Wiley & Sons: 1992). 
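
As a rough illustration of what "in-frame" means for such a fusion gene, the following Python sketch checks that a bait coding sequence continues the reading frame of a DBD coding sequence. It is only a sketch under stated assumptions: the function names and the toy sequences are hypothetical and are not taken from this disclosure, the DBD is assumed to be the N-terminal partner, and only the standard stop codons TAA, TAG and TGA are considered.

```python
# Hypothetical helper for illustration only; not part of the disclosed method.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a coding sequence into consecutive complete triplets."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(dbd_cds, bait_cds):
    """Return True if bait_cds continues the reading frame of dbd_cds.

    Assumptions: the DBD is the N-terminal partner and starts with ATG,
    both segments have lengths that are multiples of three, and a stop
    codon is tolerated only as the very last codon of the fusion.
    """
    dbd = dbd_cds.upper()
    bait = bait_cds.upper()
    if len(dbd) % 3 != 0 or len(bait) % 3 != 0:
        return False  # a frameshift would be introduced at the junction
    if not dbd.startswith("ATG"):
        return False
    fused = codons(dbd + bait)
    # Any stop codon before the last position would truncate the fusion.
    return not any(codon in STOP_CODONS for codon in fused[:-1])


# Toy sequences (hypothetical): Met-Ala-Lys fused to Gly-Ser-Trp-stop.
DBD_TOY = "ATGGCTAAA"
BAIT_TOY = "GGTTCTTGGTAA"
print(is_in_frame_fusion(DBD_TOY, BAIT_TOY))
```

A check of this kind would catch the two most common construction errors, a retained stop codon at the end of the DBD segment and a junction that shifts the frame; it says nothing about folding or DNA binding of the expressed protein.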

It may be necessary in some instances to introduce an unstructured polypeptide 
linker region between the DNA binding domain of the fusion protein and the bait 
polypeptide sequence. Where the bait fusion protein also includes oligomerization 
sequences, it may be preferable to situate the linker between the oligomerization sequences 
and the bait polypeptide. The linker can facilitate enhanced flexibility of the fusion protein 
allowing the DBD to freely interact with a responsive element, and, if present, the 
oligomerization sequences to make inter-protein contacts. The linker can also reduce steric 
hindrance between the two fragments, and allow appropriate interaction of the bait 
polypeptide portion with a prey polypeptide component of the interaction trap system. The 
linker can also facilitate the appropriate folding of each fragment. The linker can 
be of natural origin, such as a sequence determined to exist in random coil between two 
domains of a protein. An exemplary linker sequence is the linker found between the C- 
terminal and N-terminal domains of the RNA polymerase α subunit. Other examples of 
naturally occurring linkers include linkers found in the λcI and LexA proteins. 
Alternatively, the linker can be of synthetic origin. For instance, the sequence (Gly4Ser)3 
can be used as a synthetic unstructured linker. Linkers of this type are described in Huston 
et al. (1988) PNAS 85:4879; and U.S. Patent No. 5,091,513, both incorporated by reference 
herein. Another exemplary embodiment includes a poly alanine sequence, e.g., (Ala)3. 
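
The linker arrangement described above can be sketched at the amino acid level as follows. This is a minimal sketch, not a prescribed construction: the helper names and the toy domain sequences are hypothetical, and the N- to C-terminal order DBD, optional oligomerization sequence, linker, bait is an assumption made for illustration. Only the linker repeats (Gly4Ser)3 and (Ala)3 come from the text.

```python
# Hypothetical helpers for illustration; one-letter amino acid codes.
def repeat_linker(unit, n):
    """Build a synthetic linker by repeating a short unit n times."""
    return unit * n


GLY4SER_3 = repeat_linker("GGGGS", 3)  # (Gly4Ser)3 from the text
ALA_3 = repeat_linker("A", 3)          # (Ala)3 from the text


def assemble_bait_fusion(dbd, bait, linker=GLY4SER_3, oligomerization=""):
    """Concatenate DBD, optional oligomerization sequence, linker and bait.

    The linker is placed between the oligomerization sequence (if any)
    and the bait polypeptide, as suggested in the paragraph above.
    The domain order is an assumption of this sketch.
    """
    return dbd + oligomerization + linker + bait


# Toy placeholder sequences, not real protein domains.
print(GLY4SER_3)
print(assemble_bait_fusion("MKA", "WWW"))
```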

As set out above, the bait fusion protein should have little to no transcriptional 
activation ability by itself. In a preferred embodiment, a repression assay is carried out as a 
control to confirm that lack of transcriptional activation by the bait fusion protein is not 
simply because the fusion protein is mis-folded, or is sequestered in inclusion bodies. In 
one embodiment, the repression assay tests the ability of the fusion protein to competitively 
block transcription of a reporter gene construct containing a DBD recognition element. For 
example, a bait fusion protein including a DBD from PhoB can be validated, in part, by 
observing the ability of the fusion protein to inhibit, in the presence of wild-type PhoB, 
expression of a reporter gene operably linked to a pho box sequence. Where the bait fusion 
protein includes the DNA binding domain of λcI, the ability of the fusion protein to bind to 
a λ operator sequence (e.g., which could serve as the DBD recognition element) can be 
validated by its ability to confer on an E. coli strain immunity to infection by λ phage. 

IV. Prey protein constructs 

In preferred embodiments, the prey fusion protein comprises: (1) a target 
polypeptide sequence, capable of forming an intermolecular association with the bait 
polypeptide which is to be tested for such binding activity, and (2) an activation tag such as 
a PID. As described herein, the activation tag can be, for example, all or a portion of an 
RNA polymerase subunit, such as the polymerase interaction domain of the N-terminal 
domain (α-NTD) of the RNA polymerase α subunit. As described above, protein-protein 
contact between the bait and prey fusion proteins (via the interacting bait and prey 
polypeptide portions of those proteins) links the DNA-binding domain of the bait fusion 
protein with the polymerase interaction domain of the prey fusion protein, generating a 
protein complex capable of directly recruiting a functional RNA polymerase enzyme to 
DNA sequences proximate to the DNA bound bait protein, i.e., to the reporter gene. 

DNA dependent RNA polymerase in E. coli and other bacteria consists of an 
enzymatic core composed of subunits α, β, and β' in the stoichiometry α2ββ', and one of 
several alternative σ factors responsible for specific promoter recognition. In one 
embodiment, the prey fusion protein includes a sufficient portion of the amino-terminal 
domain of the α subunit to permit assembly of transcriptionally active RNA polymerase 
complexes which include the prey fusion protein. The α subunit, which initiates the 
assembly of RNA polymerase by forming a dimer, has two independently folded domains 
(Ebright et al. (1995) Curr Opin Genet Dev 5:197). The larger amino-terminal domain (α- 
NTD) mediates dimerization and the subsequent assembly of the polymerase complex. The 
prey polypeptide can be fused in frame to the α-NTD (see appended examples) or a 
fragment thereof which retains the ability to assemble a functional RNA polymerase 
complex. 

To further illustrate the ability of the α subunit to be utilized in the subject ITS, the 
coding sequence for α-NTD was fused to the coding sequence for the yeast protein 
GAL11P, a mutant form of GAL11. See Figure 2A and Himmelfarb et al. (1990) Cell 
63:1299-309. The "P" mutation confers upon GAL11, a component of the RNA 
polymerase II holoenzyme in yeast, the ability to interact with a portion of the dimerization 
region of GAL4. We also constructed a fusion protein comprised of the λcI protein having 
GAL4 fused at its C-terminus. As demonstrated in Figure 2B, the co-expression of both 
fusion proteins can activate the expression of a reporter gene under the transcriptional 
control of a λcI operator. Substitution of the wildtype GAL11 sequence for the GAL11P 
sequence results in loss of transcriptional activity of the co-expressed fusion proteins. 

Figure 4 similarly illustrates the use of the α-NTD. In that embodiment, p53 was 
fused to both α-NTD and to the DBD of λcI. The p53 protein includes, in its carboxy 
terminus, an oligomerization domain which mediates formation of p53 homodimers and 
heterodimers. As demonstrated in Figure 4, the co-expression of both fusion proteins can 
activate the expression of a reporter gene under the transcriptional control of a λcI operator, 
presumably by p53-mediated oligomerization (e.g., dimerization and/or tetramerization). 
Expression of only the p53/λcI fusion, e.g., in the presence of the wildtype α subunit, did not 
activate expression of the reporter gene above basal levels. 

The present invention also contemplates the use of polymerase interaction domains 
containing portions of other RNA polymerase subunits or portions of molecules which 
associate with an RNA polymerase subunit or subunits. Contemporary models of the 
polymerase complex predict a substantial degree of intramolecular motion within the 
transcription complex. Movement of parts of the enzyme complex relative to each other is 
believed to be realized by structurally independent domains, such as the N-terminal and C- 
terminal domains of the α subunit described above. Accordingly, it is possible that the 
paradigm of transcriptional activation realized with fusion proteins incorporating only a 
portion of the α subunit is also applicable to fusion proteins generated with portions of other 
polymerase subunits, preferably subunits which are an integral part of or tightly associated 
with the polymerase complex, e.g., the β, β', ω and/or σ subunits. The use of 
portions of such other subunits to generate a prey fusion protein is, like the α-NTD 
example above, expected to provide fusion proteins which retain the ability to form active 
polymerase complexes. For example, Severinov et al. (1995) PNAS 92:4591 describes the 
ability of fragments of the β subunit (encoded by the E. coli rpoB gene) to reconstitute a 
functional polymerase enzyme. It is noted that it may be a formal requirement of 
embodiments utilizing prey fusion proteins including PIDs of the β, β' or σ subunits that 
other fragments of the subunit be provided, e.g., co-expressed, in the host cell. 

To further illustrate such equivalents, it is noted that highly purified E. coli RNA 
polymerase contains a small subunit termed omega (ω). See Figure 3A. This subunit 
consists of 91 amino acids with a molecular weight of 10,105. Its cloning has been 
previously reported (Gentry et al. (1986) Gene 48:33-40). We fused the ω coding sequence 
in frame to the C-terminus of λcI. See Figure 3B. In bacterial strains lacking wildtype ω, 
the λcI-ω fusion protein was able to drive expression of a β-gal reporter gene having a λcI 
operator. Figure 3C illustrates that λcI itself was unable to efficiently induce expression of the 
reporter gene. Moreover, wildtype ω can effectively compete for binding to the holoenzyme 
complex, and can inhibit the ability of λcI-ω to induce expression of the reporter gene. 

To demonstrate the ability of the ω subunit to be utilized in the subject ITS, the 
coding sequence for ω was fused to the coding sequence for GAL11P. See Figure 3D. We 
also constructed a fusion protein comprised of the λcI protein having GAL4 fused at its C- 
terminus. As demonstrated in Figure 3E, the co-expression of both fusion proteins can 
activate the expression of a reporter gene under the transcriptional control of a λcI operator. 
Substitution of the wildtype GAL11 sequence for the GAL11P sequence results in loss of 
transcriptional activity of the co-expressed fusion proteins. 

Additionally, given the general conservation of the polymerase subunits amongst 
bacteria, the present invention also specifically contemplates prey fusion proteins derived 
with polymerase interaction domains of RNA polymerase subunits from other bacteria, e.g., 
Staphylococcus aureus (Deora et al. (1995) Biochem Biophys Res Commun 208:610), 
Bacillus subtilis, etc. 

In an alternative embodiment, instead of a polymerase interaction domain, the prey 
fusion protein can include an activation domain of a transcriptional activator protein. The 
bait fusion protein, by forming DNA bound complexes with the prey fusion protein, can 
indirectly recruit RNA polymerase complexes to the promoter sequences of the reporter 
gene, thus activating transcription of the reporter gene. To illustrate, the activation domain 
can be derived from such transcription factors as PhoB or OmpR. The critical consideration 
in the choice of the activation domain is its ability to interact with RNA polymerase 
subunits or complexes in the host cell in such a way as to be able to activate transcription of 
the reporter gene. 

The prey fusion proteins can differ in the polymerase interaction domains or target 
surfaces they include, and in whether they contain other useful moieties such as epitope 
tags, oligomerization domain, etc. There are also a wide variety of prey polypeptides which 
can be selected to generate the fusion protein. The prey polypeptide can be derived from all 
or a portion of a known protein or a mutant thereof, all or a portion of an unknown protein 
(e.g., encoded by a gene cloned from a cDNA library), or a random polypeptide sequence 
(or be a random sequence included in a larger polypeptide sequence). 

To isolate DNA sequences encoding novel interacting proteins, members of a DNA 
expression library (e.g., a cDNA or synthetic DNA library, either random or intentionally 
biased) can be fused in-frame to the activation tag (e.g., the polymerase interaction domain 
or activation domain) to generate a variegated library of prey fusion proteins. Those 
library-encoded proteins that physically interact with the promoter-bound bait fusion protein 
detectably activate expression of the reporter gene and provide a ready assay for identifying 
a particular DNA clone encoding an interacting protein of interest. 

In an exemplary embodiment, cDNAs may be constructed from any mRNA 
population and inserted into an equivalent expression vector. Such a library of choice may 
be constructed de novo using commercially available kits (e.g., from Stratagene, La Jolla, 
CA) or using well established preparative procedures (see, for example, Current Protocols 
in Molecular Biology, Eds. Ausubel et al., John Wiley & Sons: 1992). Alternatively, a 
number of cDNA libraries (from a number of different organisms) are publicly and 
commercially available; sources of libraries include, e.g., Clontech (Palo Alto, CA) and 
Stratagene (La Jolla, CA). It is also noted that prey polypeptides need not be naturally 
occurring full-length proteins. In preferred embodiments, prey proteins are encoded by 
synthetic DNA sequences, are the products of randomly generated open reading frames, are 
open reading frames synthesized with an intentional sequence bias, or are portions thereof. 
Preferably, such short randomly generated sequences encode peptides between, for 
example, 4 and 60 amino acids in length. 
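
To make the idea of a randomly generated peptide library concrete, the following Python sketch enumerates random peptides whose lengths fall in the exemplary 4 to 60 residue range given above. It is an in silico illustration only, under stated assumptions: the function name is hypothetical, residues are drawn uniformly from the 20 standard amino acids (real libraries built from degenerate oligonucleotides would show codon-dependent bias), and the fixed seed is used only to make the example reproducible.

```python
import random

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide_library(size, min_len=4, max_len=60, seed=0):
    """Return `size` random peptides with lengths in [min_len, max_len].

    Hypothetical helper: uniform residue sampling is an assumption of
    this sketch, not a property of any particular library protocol.
    """
    rng = random.Random(seed)  # local generator, reproducible for a given seed
    library = []
    for _ in range(size):
        length = rng.randint(min_len, max_len)  # inclusive bounds
        peptide = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        library.append(peptide)
    return library


library = random_peptide_library(5)
for peptide in library:
    print(len(peptide), peptide)
```

An intentionally biased library, also contemplated above, could be sketched by replacing the uniform choice with a weighted one (for example `rng.choices` with a `weights` argument).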

It will be appreciated by those skilled in the art that many variations of the prey and 
bait fusion proteins can be constructed and should be considered within the scope of the 
present invention. For example, it will be understood that, for screening polypeptide 
libraries, the identity of the prey polypeptide can be fixed and the bait protein can be varied 
to generate the library. Indeed, in certain embodiments it will be desirable to derive the 
prey fusion protein with a fixed prey polypeptide rather than a variegated library on the 
grounds that the single prey fusion protein can be easily tested for its ability to be assembled 
into a functional RNA polymerase enzyme. Moreover, where the prey fusion protein is 
derived with a polymerase interaction domain, the bait fusion protein is likely to be less 
sensitive to variations caused by the different peptides of the library than is the prey fusion 
protein. In such embodiments, a variegated bait polypeptide library can be used to create a 
library of bait fusion proteins to be tested for interaction with a particular prey protein. 

While it will generally be desirable for the DBD and bait polypeptide portions of the 
bait fusion protein, and activation tag and prey polypeptide portions of the prey fusion 
protein to be derived from different, e.g., heterologous, proteins, the present invention also 
contemplates embodiments of the instant assay wherein one of the two bait or prey proteins 
is a naturally occurring protein rather than a heterologous fusion protein. As an illustration, 
the bait protein can be a dimeric transcriptional activator which undergoes a higher order 
tetramerization reaction. That dimer-dimer interaction can be selected as the target of an 
assay to identify an agent which selectively disrupts the inter-dimer contacts. In such 
embodiments, the full-length transcriptional activator can serve the role of the bait protein, 
and the prey fusion protein can include, for example, that portion of the transcriptional 
activator which is involved in the formation of tetrameric complexes. 

Moreover, either or both the prey and bait proteins, if desired, may include epitope 
tags (e.g., portions of the c-myc protein or the FLAG epitope available from Immunex). The 
epitope tag can facilitate a simple immunoassay for fusion protein expression, e.g., to detect 
the presence and folding of the fusion protein. 

In other embodiments of the subject ITS, particularly those in which a polypeptide 
library is displayed on either the bait or prey protein, the fusion proteins can be generated to 
include, in addition to the test polypeptide sequences, a polypeptide sequence with another 
known polypeptide sequence. Thus, a prey fusion protein can be generated having the 
following exemplary formula: A-B-C, where A is an α-NTD, B is a control binding 
sequence (such as the C terminal domain [CTD] of λcI), and C is the test polypeptide 
sequence. To assure oneself that the fusion protein is correctly folded, the fusion protein 
can be first tested in an ITS using λcI CTD in the bait protein, the C terminal domain 
included in the prey protein providing a means for binding (by dimerization) with the bait. 
Prey fusion proteins which pass this control ITS can then be sampled in an ITS wherein bait 
is constructed with test polypeptide(s). Of course it will be appreciated that the order of the 
control and test polypeptides can be reversed. 

In other embodiments, the construct encoding the prey (or bait) fusion protein can 
include a promoter for in vitro translation (e.g., a T7 promoter) of the target polypeptide, 
c.f., Yavuzer et al. (1995) Gene 165:93. Such constructs can be used to eliminate 
subcloning steps necessary to carry out certain validation assays often undertaken after the 
initial identification of the protein in the interaction trap, e.g., to determine if the binding of 
the two hybrid proteins is truly the result of an interaction between the bait and prey 
polypeptides per se. 

In another aspect of the present invention, the DNA sequence encoding the prey 
protein (or alternatively the bait protein) is embedded in a DNA sequence encoding a 
conformation-constraining protein (i.e., a protein that decreases the flexibility of the amino 
and carboxy termini of the prey protein). Such embodiments are preferred where the prey 
polypeptide is a relatively short peptide, e.g., 5-25 amino acid residues. In general, 
conformation-constraining proteins act as scaffolds or platforms, which limit the number of 
possible three dimensional configurations the peptide or protein of interest is free to adopt. 
Preferred examples of conformation-constraining proteins are thioredoxin or other 
thioredoxin-like sequences, but many other proteins are also useful for this purpose. 
Preferably, conformation-constraining proteins are small in size (generally, less than or 
equal to 200 amino acids), rigid in structure, of known three dimensional configuration, and 
are able to accommodate insertions of proteins of interest without undue disruption of their 
structures. A key feature of such proteins is the availability, on their solvent exposed 
surfaces, of locations where peptide insertions can be made (e.g., the thioredoxin active-site 
loop). 

As mentioned above, one preferred conformation-constraining protein according to 
the invention is thioredoxin or other thioredoxin-like proteins. The three dimensional 
structure of E. coli thioredoxin is known and contains several surface loops, including a 
distinctive Cys-Cys active-site loop between residues Cys33 and Cys36 which protrudes 
from the body of the protein. This Cys-Cys active-site loop is an identifiable, accessible 
surface loop region and is not involved in interactions with the rest of the protein which 
contribute to overall structural stability. It is therefore a good candidate as a site for prey 
protein insertions. Both the amino- and carboxyl-termini of E. coli thioredoxin are on the 
surface of the protein and are also readily accessible for fusion construction. 

It may be preferred for a variety of reasons that prey (or bait) polypeptides be fused 
within the active-site loop of thioredoxin or thioredoxin-like molecules. The face of 
thioredoxin surrounding the active-site loop has evolved, in keeping with the protein's major 
function as a nonspecific protein disulfide oxido-reductase, to be able to interact with a wide 
variety of protein surfaces. The active-site loop region is found between segments of strong 
secondary structure and this provides a rigid platform to which one may tether prey 
proteins. A small prey protein inserted into the active-site loop of a thioredoxin-like protein 
is present in a region of the protein which is not involved in maintaining tertiary structure. 
Therefore the structure of such a fusion protein is stable. Thus, relatively short peptides may 
be displayed as part of the prey fusion protein by virtue of the fusion of the thioredoxin 
protein to a polymerase interaction domain. Such embodiments are useful for screening 
peptide libraries for interactors with a particular target bait protein. 

The subject assay can also be used to generate antibody equivalents for specific 
determinants, e.g., single chain antibodies, minibodies or the like. Indeed, the 
subject method can be used to identify a novel binding partner for a given 
epitope/determinant where the new binding partner is a completely artificial polypeptide. 
For example, a target polypeptide (or epitope thereof) for which an antibody or antibody 
equivalent is sought can be displayed on either the bait or prey fusion protein. A library of 
potential binding partners can be arrayed on the other fusion protein, as appropriate. 
Interactions between the target polypeptide and members of the library of binding partners 
can be detected according to methods described herein. Thus, the present invention 
provides a convenient method for identifying recombinant nucleic acid sequences which 
encode proteins useful in the replacement of, e.g., monoclonal antibodies. 

In another embodiment of the subject ITS, the system can be used to identify 
proteolytic activities which cleave a given polypeptide sequence, or to identify the sequence 
specificity for a given protease. For example, in the embodiment of the subject ITS 
illustrated in Figure 1B, a desired cleavage sequence can be introduced into the bait or prey 
fusion proteins such that, upon cleavage of the fusion protein at that sequence, the DNA 
localization of the prey protein is lost. To further illustrate, a substrate sequence for which a 
proteolytic activity is desired can be engineered into the linker sequence separating the N- 
and C-terminal domains of the bait protein shown in Figure 1B. In the absence of 
proteolysis of that sequence, the intact prey and bait proteins induce expression of a reporter 
gene (or "inverter" gene as appropriate). The presence in the cell of a proteolytic activity 
which recognizes the substrate sequence can result in cleavage of the bait protein, separating 
the DBD from that portion of the protein which interacts with the prey fusion protein. Such 
embodiments of the ITS can be used to screen libraries of proteolytic proteins, e.g., derived 
from cDNA libraries, catalytic antibodies, or generated by combinatorial mutagenesis of 
existing enzymes. 

In other embodiments, peptide libraries can be engineered into one of the fusion 
proteins and proteolysis of the fusion protein by a predetermined proteolytic activity used to 
identify the sequence specificity of the proteolytic activity and/or optimize the sequence for 
a substrate or inhibitor for the proteolytic activity. For example, a variety of proteases have 
been identified as being involved in various disease states. In many instances, the substrate 
specificity for a protease has not yet been fully determined or optimized. Utilizing the 
subject ITS, the substrate specificity for a given protease can be accurately determined, and 
selective substrates or inhibitors, as appropriate, can be developed based on that sequence 
information. 

In still other embodiments, the subject ITS can be derived to score for heteromeric 
combinations of three or more proteins by providing two or more different bait fusion 
proteins and/or two or more different prey fusion proteins in the same system, i.e., at least 
three different fusion proteins. This concept is illustrated by an example using α-NTD 
fusion proteins. 

The α subunit of E. coli RNA polymerase plays a key role in assembly of the core 
enzyme. In previous studies, it has been demonstrated that the holoenzyme includes two α 
subunits, only one of which interacts with β. Assembly-deficient mutants of α have been 
identified, such as α-R45A (having substituted Ala for Arg at residue 45). This mutant 
dimerizes, but does not assemble β subunits. See Kimura et al. (1995) J Mol Biol 254:342. 
When over-expressed in cells also expressing wildtype α, the equilibrium of the system 
favors formation of holoenzyme complexes which are heterologous with respect to α, e.g., 
including one wildtype and one R45A mutant subunit. Thus, making fusion proteins with a 
DNA binding domain, and with each of the wildtype and R45A α-NTDs, the system can 
accommodate three different polypeptide sequences which can be tested for simultaneous 
interactions. In other embodiments, fusing the same polypeptide sequence to the two 
different α-NTD sequences can be used to distinguish oligomerization mechanisms, e.g., 
distinguish tetramerization from pairwise dimerization. 

V. Reporter gene constructs 

The reporter gene of this invention ultimately measures the end stage of the above 
described cascade of events, e.g., transcriptional modulation, and, if desired, permits the 
isolation of ITS cells on the basis of that criterion. Accordingly, in practicing one 
embodiment of the assay, a reporter gene construct is inserted into the reagent cell in order 
to generate a detection signal dependent on interaction of the bait and prey fusion proteins. 
Typically, the reporter gene construct will include a reporter gene in operative linkage with 
one or more transcriptional regulatory elements which include, or are linked to, a DBD 
recognition element for the DBD of the bait fusion protein, with the level of expression of 
the reporter gene providing the prey protein interaction-dependent detection signal. Many 
reporter genes and transcriptional regulatory elements useful in the subject flow-ITS are 
known to those of skill in the art and others may be readily identified or synthesized. 
Moreover, DBD recognition elements are known in the art for a wide variety of DNA 
binding domains which may be used to construct the bait proteins of the present invention. 
Exemplary recognition elements include the λ operator, the LexA operator, the pho box, 
and the like. 

A "reporter gene" includes any gene that expresses a detectable gene product, which 
may be RNA or protein. Preferred reporter genes are those that are readily detectable. The 
reporter gene may also be included in the construct in the form of a fusion gene with a gene 
that includes desired transcriptional regulatory sequences or exhibits other desirable 
properties. 

Examples of reporter genes include, but are not limited to, CAT (chloramphenicol 
acetyl transferase) (Alton and Vapnek (1979), Nature 282: 864-869), luciferase, and other 
enzyme detection systems, such as beta-galactosidase; firefly luciferase (deWet et al. 
(1987), Mol. Cell. Biol. 7:725-737); bacterial luciferase (Engebrecht and Silverman (1984), 
PNAS 81: 4154-4158; Baldwin et al. (1984), Biochemistry 23: 3663-3667); 
phycobiliproteins (especially phycoerythrin); green fluorescent protein (GFP: see Valdivia 
et al. (1996) Mol Microbiol 22: 367-78; Cormack et al. (1996) Gene 173 (1 Spec No): 33-8; 
and Fey et al. (1995) Gene 165:127-130); alkaline phosphatase (Toh et al. (1989) Eur. J. 
Biochem. 182: 231-238; Hall et al. (1983) J. Mol. Appl. Gen. 2: 101); and secreted alkaline 
phosphatase (Cullen and Malim (1992) Methods in Enzymol. 216:362-368). Other 
examples of suitable reporter genes include those which encode proteins conferring 
drug/antibiotic resistance to the host bacterial cell, or which encode proteins required to 
complement an auxotrophic phenotype. A preferred reporter gene is the spc gene, which 
confers resistance to spectinomycin. 

The amount of transcription from the reporter gene may be measured using any 
method known to those of skill in the art to be suitable. For example, specific mRNA 
expression may be detected using Northern blots or specific protein product may be 
identified by a characteristic stain or an intrinsic activity. 

In preferred embodiments, the gene product of the reporter is detected by an intrinsic 
activity associated with that product. For instance, the reporter gene may encode a gene 
product that, by enzymatic activity, gives rise to a detection signal based on color, 
fluorescence, or luminescence. 

The amount of expression from the reporter gene is then compared either to the amount of 
expression in the same cell in the absence of the test compound or to the 
amount of transcription in a substantially identical cell that lacks 
heterologous DNA, such as the gene encoding the prey fusion protein. Any statistically or 
otherwise significant difference in the amount of transcription indicates that the prey fusion 
protein interacts with the bait fusion protein. 
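
By way of illustration only, the comparison described above can be sketched in Python. The function names and the replicate readings are hypothetical and form no part of the disclosure; the sketch assumes replicate reporter measurements (e.g., enzyme activity units) from cells with and without the prey fusion protein, and it reports a fold change and Welch's t statistic. The threshold for calling a difference significant is left to the practitioner.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change(test_readings, control_readings):
    """Ratio of mean reporter signal in test cells to that in control cells."""
    return mean(test_readings) / mean(control_readings)


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a = stdev(sample_a) ** 2 / len(sample_a)
    var_b = stdev(sample_b) ** 2 / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)


# Hypothetical replicate readings (arbitrary units), for illustration only.
with_prey = [410.0, 395.0, 430.0]
without_prey = [52.0, 48.0, 50.0]

print("fold change:", fold_change(with_prey, without_prey))
print("Welch t statistic:", welch_t(with_prey, without_prey))
```

A large fold change accompanied by a large t statistic would be consistent with a bait-prey interaction; the t statistic would ordinarily be converted to a p-value using the Welch-Satterthwaite degrees of freedom, which this sketch omits.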

In other preferred embodiments, the reporter or marker gene provides a selection 
method such that cells in which the reporter gene is activated have a growth advantage. For 
example the reporter could enhance cell viability, e.g., by relieving a cell nutritional 
requirement, and/or provide resistance to a drug. For example the reporter gene could 
encode a gene product which confers the ability to grow in the presence of a selective agent, 
e.g., chloramphenicol or kanamycin. 

In bacteria, suitable positively selectable (beneficial) genes include genes involved 
in biosynthesis or drug resistance. Countless other genes are potential selective markers. 
Certain of the above are involved in well-characterized biosynthetic pathways. In the 
simplest case, the cell is auxotrophic for an amino acid, such as histidine (requires histidine 
for growth), in the absence of activation of the reporter gene. Activation leads to synthesis 
of an enzyme required for biosynthesis of the amino acid and the cell becomes prototrophic 
for that amino acid (does not require an exogenous source). Thus the selection is for growth 
in the absence of that amino acid in the culture media. 

Another class of useful reporter genes encode cell surface proteins for which 
antibodies or ligands are available. Expression of the reporter gene allows cells to be 
detected or affinity purified by the presence of the surface protein. 

In appropriate assays, so-called counterselectable or negatively selectable genes 
may be used. 

The marker gene may also be a screenable gene. The screened characteristic may be 
a change in cell morphology, metabolism or other screenable features. Suitable markers 
include β-galactosidase, alkaline phosphatase, horseradish peroxidase, luciferase, bacterial 
green fluorescent protein; secreted alkaline phosphatase (SEAP); and chloramphenicol 
acetyl transferase (CAT). Some of the above can be engineered so that they are secreted (although 
not β-galactosidase). A preferred screenable marker gene is β-galactosidase; bacterial cells 
expressing the enzyme convert the colorless substrate Xgal into a blue pigment. 

In general, many of the embodiments of the ITS described above rely upon 
expression of the reporter as a positive readout, typically manifested either (1) as an enzyme 
activity (e.g., β-galactosidase) or (2) as enhanced cell growth on a defined medium (e.g., 
antibiotic resistance). Thus, these methods are suited for identifying a positive interaction of 
the bait and prey polypeptides, but are not well suited for identifying agents or conditions 
which inhibit intermolecular association between two polypeptide sequences. In part, this is 
because a failure to obtain expression of the reporter gene can result from many events 
which do not stem from a specific inhibition of binding of the two hybrid proteins. For 
example, an ITS using a reporter gene that stimulates growth under defined conditions 
theoretically can be used to screen for agents that inhibit the intermolecular association of 
the two hybrid proteins, but it will be difficult or impossible to discriminate agents that 
specifically inhibit the association of the two hybrid proteins from agents which simply 
inhibit cell growth. Thus, an agent which is cytotoxic to the bacterial cell will prevent cell 
growth without specifically inhibiting the interaction of two hybrid proteins and will score 
falsely as a positive hit. Similarly, an ITS using a lacZ reporter gene or the like, or a 
cytotoxic gene, will falsely score general transcription or translation inhibitors as being 
inhibitors of two hybrid protein binding. Thus, ITS embodiments that produce a positive 
readout contingent upon intermolecular binding of the bait and prey proteins are generally 
not suitable for screening for agents which inhibit binding of the two hybrid proteins. 

To avoid such confounding results, the ITS format can be modified slightly to 
provide a "reverse ITS". In the reverse ITS, the reporter gene encodes a transcriptional 
repressor which is expressed upon interaction of the bait and prey proteins. However, the 
host cell also includes a second reporter gene which, but for an operator sequence 
responsive to the repressor protein produced by the first reporter gene, would otherwise be 
expressed. Thus, the gene product of the first reporter gene regulates expression of the 
second reporter gene, and the expression of the latter provides a means for indirectly scoring for 
the expression of the former. Essentially, the first reporter gene can be seen as a signal 
inverter. 

In this exemplary system, the bait and prey proteins positively regulate expression of 
the first reporter gene. Accordingly, where the first reporter gene is a repressor of 
expression of the second reporter gene, reducing expression of the first reporter gene by 
inhibiting the formation of complexes between the bait and prey proteins concomitantly 
relieves inhibition of the second reporter gene. For example, the first reporter gene can 
include the coding sequences for λcI. The second reporter gene can accordingly be a 
positive signal, such as providing for growth (e.g., drug selection or auxotrophic relief), and 
is under the control of a promoter which is constitutively active, but can be repressed by 
λcI. In the absence of an agent which inhibits the interaction of the bait and prey proteins, 
the λcI protein is expressed. In turn, that protein represses expression of the second reporter 
gene. However, an agent which disrupts binding of the bait and prey proteins results in a 
decrease in λcI expression, and consequently an increase in expression of the second 
reporter gene as λcI repression is relieved. Hence, the signal is inverted. 
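
The signal inversion described above can be summarized, purely as an illustrative sketch, by the following Boolean model in Python. The function and parameter names are hypothetical, and the model deliberately ignores expression levels, leakiness and kinetics: it assumes only that the repressor is made when the bait-prey complex forms and that the second reporter is expressed when the repressor is absent.

```python
def second_reporter_expressed(proteins_can_interact: bool,
                              inhibitor_present: bool) -> bool:
    """Boolean sketch of the reverse ITS (hypothetical, qualitative model)."""
    # The bait-prey complex forms only if the proteins can interact and
    # no agent is blocking the interaction.
    complex_forms = proteins_can_interact and not inhibitor_present
    # First reporter: the repressor is expressed only when the complex forms.
    repressor_expressed = complex_forms
    # Second reporter: constitutive promoter, switched off by the repressor.
    return not repressor_expressed


for inhibitor in (False, True):
    state = second_reporter_expressed(True, inhibitor)
    print(f"inhibitor present={inhibitor} -> second reporter on={state}")
```

In this simplified picture an inhibitor of the bait-prey interaction turns the second reporter on, so a growth-conferring second reporter yields a positive selection for inhibitors rather than a loss of signal.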

In yet another embodiment for detecting agents which disrupt the bait-prey 
interaction, it is envisioned that under certain conditions the interaction between bait and 
prey fusion proteins might result in transcription repression rather than activation. For 
example, it is speculated that sufficiently strong binding between a bait fusion protein and a 
prey fusion protein may impede the escape of the polymerase from the promoter, which 
escape is required for elongation of a transcript, thus repressing transcription. In particular, 
a strong interaction between the bait and prey proteins, combined with a strong promoter 
(e.g., one which is more efficient at binding the polymerase complex even in the absence of 
transcription factors) can result in repression of reporter gene expression. Under these 
conditions an inhibitor of bait-prey complex formation will, over a certain concentration 
range, cause the effective association constant of the complex to be reduced sufficiently to 
result in relief of the repression and concomitant transcription of the reporter gene. At 
higher concentrations, inhibitors of the bait-prey complex may result in inhibition (or return 
to basal levels) of transcription by the loss of bait-prey complexes. Thus, in one 
embodiment, the candidate agent can be spotted on a lawn of reagent cells plated on a solid 
medium. The diffusion of the candidate agent through the solid medium surrounding the site 
at which it was spotted will create a concentration gradient. For agents which inhibit the 
formation of bait-prey complexes, a halo of reporter gene expression would be expected in 
an area which corresponds to concentrations of the agent which offset the effect of the 
repression due to strong association between the two hybrid proteins, but which are not so 
great as to substantially inhibit the formation of bait-prey complexes. 
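
The expected halo can be illustrated with a toy calculation in Python. Everything in this sketch is an assumption introduced for illustration: the exponential fall-off of agent concentration with distance from the spot, the parameter names, and the two threshold concentrations (`relief_conc`, above which repression is relieved, and `disruption_conc`, above which bait-prey complexes are substantially lost, with `relief_conc` less than `disruption_conc`). Real diffusion profiles on agar will differ.

```python
import math


def agent_concentration(radius_mm, spot_conc, decay_length_mm):
    """Assumed profile: concentration decays exponentially with distance."""
    return spot_conc * math.exp(-radius_mm / decay_length_mm)


def halo_bounds(spot_conc, decay_length_mm, relief_conc, disruption_conc):
    """Return (inner, outer) radii in mm of the ring where
    relief_conc <= concentration < disruption_conc, or None if no ring forms."""
    if spot_conc < relief_conc:
        return None  # agent never reaches the level needed to relieve repression
    outer = decay_length_mm * math.log(spot_conc / relief_conc)
    if spot_conc > disruption_conc:
        inner = decay_length_mm * math.log(spot_conc / disruption_conc)
    else:
        inner = 0.0  # the expressing zone extends to the spot itself
    return inner, outer


# Hypothetical values in arbitrary concentration units.
print(halo_bounds(spot_conc=100.0, decay_length_mm=2.0,
                  relief_conc=1.0, disruption_conc=10.0))
```

Under these assumptions reporter expression appears as a ring between the two radii: too close to the spot the complexes are lost altogether, and too far from it the repression is never relieved.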

Still another consideration in generating the reporter gene construct concerns the 
placement of the DBD recognition element relative to the reporter gene and other 
transcriptional elements with which it is associated. In most embodiments, it will be 
desirable to position the recognition element at an inert position. In some instances, the 
axial position of the DBD relative to the promoter sequences can be important. 

In certain embodiments, the sensitivity of the ITS can be enhanced for detecting 
weak protein-protein interactions by placing the DBD recognition sequence at a position 
permitting secondary interactions (if any) between other portions of the bait fusion protein 
and the RNA polymerase complex. For example, as described in the appended examples, 
an apparent synergistic effect was observed when the λ operator was moved close to or at its 
normal position. While not wishing to be bound by any particular theory, this synergism is 
speculated to be the result of a bait-prey interaction and a second interaction between the DBD of 
λcI and a second polymerase subunit (σ). 

It will also be understood by those skilled in the art that the sensitivity to the 
strength of the interactions between the bait and prey proteins can be "tuned" by adjusting 
the sequence of the recognition element. For example, the use of a strong λ operator instead 
of a weak one can improve the sensitivity of the assay to weak bait-prey interactions, as well as 
help to overcome lack of dimerization if no dimerization signals are included in the bait 
fusion protein. 

In particular embodiments, it may be desirable to provide two or more reporter gene 
constructs which are regulated by interaction of the bait and prey proteins. The 
simultaneous expression of the various reporter genes (whether provided on the same or 
separate plasmids) provides a means for distinguishing actual interaction of the bait and 
prey proteins from, e.g., mutations or other spurious activation of the reporter gene. 

VI. Host cells 

Exemplary prokaryotic host cells are gram-negative bacteria such as Escherichia 
coli, or gram-positive bacteria such as Bacillus subtilis. 
Recognized prokaryotic hosts include bacterial strains of Escherichia, Bacillus, 
Streptomyces, Pseudomonas, Salmonella, Serratia, Shigella and the like. The prokaryotic 
host must be compatible with the replicon and control sequences in the expression plasmid. 

Preferred prokaryotic host cells for use in carrying out the present invention are 
strains of the bacteria Escherichia, although Bacillus and other genera are also useful. 
Techniques for transforming these hosts and expressing foreign genes cloned in them are 
well known in the art (see e.g., Maniatis et al. and Sambrook et al., ibid.). Vectors used for 
expressing foreign genes in bacterial hosts will generally contain a selectable marker, such 
as a gene for antibiotic resistance, and a promoter which functions in the host cell. 
Appropriate promoters include the trp (Nichols et al. (1983) Meth. Enzymol. 101:155-164), 
lac (Casadaban et al. (1980) J. Bacteriol. 143:971-980), and phage λ promoter 
systems (Queen (1983) J. Mol. Appl. Genet. 2:1-10). Plasmids useful for transforming 
bacteria include pBR322 (Bolivar et al. (1977) Gene 2:95-113), the pUC plasmids (Messing 
(1983) Meth. Enzymol. 101:20-77; Vieira and Messing (1982) Gene 19:259-268), pCQV2 
(Queen, supra), pACYC plasmids (Chang et al. (1978) J Bacteriol 134:1141), pRW 
plasmids (Lodge et al. (1992) FEMS Microbiol Lett 95:271), and derivatives thereof. 

The choice of appropriate host cell will also be influenced by the choice of detection 
signal. For instance, reporter constructs, as described above, can provide a selectable or 
screenable trait upon transcriptional activation (or inactivation). The reporter gene may be 
an unmodified gene already in the host cell pathway, such as sporulation genes. It may be a 
host cell gene that has been operably linked to a "bait-responsive" promoter. Alternatively, 
it may be a heterologous gene that has been so linked. Suitable genes and promoters are 
discussed above. Accordingly, it will be understood that to achieve selection or screening, 
the host cell must have an appropriate phenotype. For example, introducing a histidine 
biosynthesis gene into a yeast that has a wild-type form of that gene would frustrate genetic 
selection. Thus, to achieve nutritional selection, an auxotrophic strain will be desired which 
is complemented by expression of the reporter gene. 

In other embodiments, the host cell can be a eukaryotic cell, particularly a yeast cell, 
which has been engineered to express a sufficient number of the bacterial polymerase 
subunits necessary to induce (reporter) gene expression in the cell in a manner dependent on 
the bait and prey proteins and the bacterial RNA polymerase subunits. It may be desirable 
in such embodiments to include a nuclear localization signal as part of one or more of the 
bacterial proteins. Regulatory sequences for the recombinant expression of these proteins in 
eukaryotic cells may also need to be optimized. 

VII. Exemplary Uses of the Prokaryotic ITS 

The prokaryotic ITS of the present invention can be used, inter alia, for 
identifying protein-protein interactions, e.g., for generating protein linkage maps, for 
identifying therapeutic targets, and/or for general cloning strategies. As described above, 
the ITS can be derived with a cDNA library to produce a variegated array of bait or prey 
proteins which can be screened for interaction with, for example, a known protein expressed 
as the corresponding fusion protein in the ITS. In other embodiments, both the bait and 
prey proteins can be derived to each provide variegated libraries of polypeptide sequences. 
One or both libraries can be generated by random or semi-random mutagenesis. For 
example, random libraries of polypeptide sequences can be "crossed" with one another by 
simultaneous expression in the subject assay. Such embodiments can be used to identify 
novel binding pairs of polypeptides. 

Alternatively, the subject ITS can be used to map residues of a protein involved in a 
known protein-protein interaction. Thus, for example, various forms of mutagenesis can be 
utilized to generate a combinatorial library of either bait or prey polypeptides, and the 
ability of the corresponding fusion protein to function in the ITS can be assayed. Mutations 
which result in diminished (or potentiated) binding between the bait and prey fusion 
proteins can be detected by the level of reporter gene activity. For example, mutants of a 
particular protein which alter interaction of that protein with another protein can be 
generated and isolated from a library created, for example, by alanine scanning mutagenesis 
and the like (Ruf et al., (1994) Biochemistry 33:1565-1572; Wang et al., (1994) J. Biol. 
Chem. 269:3095-3099; Balint et al., (1993) Gene 137:109-118; Grodberg et al., (1993) Eur. 
J. Biochem. 218:597-601; Nagashima et al., (1993) J. Biol. Chem. 268:2888-2892; Lowman 
et al., (1991) Biochemistry 30:10832-10838; and Cunningham et al., (1989) Science 
244:1081-1085); by linker scanning mutagenesis (Gustin et al., (1993) Virology 193:653-660; 
Brown et al., (1992) Mol. Cell Biol. 12:2644-2652; McKnight et al., (1982) Science 
232:316); by saturation mutagenesis (Meyers et al., (1986) Science 232:613); by PCR 
mutagenesis (Leung et al., (1989) Method Cell Mol Biol 1:11-19); or by random 
mutagenesis (Miller et al., (1992) A Short Course in Bacterial Genetics, CSHL Press, Cold 
Spring Harbor, NY; and Greener et al. (1994) Strategies in Mol Biol 7:32-34). Linker 
scanning mutagenesis, particularly in a combinatorial setting, is an attractive method for 
identifying truncated (bioactive) forms of a protein, e.g., to establish binding domains. 

In other embodiments, the ITS can be designed for the isolation of genes encoding 
proteins which physically interact with a protein/drug complex. The method relies on 
detecting the reconstitution of a transcriptional activator in the presence of the drug, such as 
rapamycin, FK506 or cyclosporin. If the bait and prey fusion proteins are able to interact in 
a drug-dependent manner, the interaction may be detected by reporter gene expression. 

Another aspect of the present invention relates to the use of the prokaryotic ITS in 
the development of assays which can be used to screen for drugs which are either agonists 
or antagonists of a protein-protein interaction of therapeutic consequence. In a general 
sense, the assay evaluates the ability of a compound to modulate binding between the bait 
and prey polypeptides. Exemplary compounds which can be screened include peptides, 
nucleic acids, carbohydrates, small organic molecules, and natural product extract libraries, 
such as those isolated from animals, plants, fungi and/or microbes. 

In many drug screening programs which test libraries of compounds and natural 
extracts, high throughput assays are desirable in order to maximize the number of 
compounds surveyed in a given period of time. The subject ITS-derived screening assays 
can be carried out in such a format, and accordingly may be used as a "primary" screen. 
Accordingly, in an exemplary screening assay of the present invention, an ITS is generated 
to include specific bait and prey fusion proteins known to interact, and compound(s) of 
interest. Detection and quantification of reporter gene expression provides a means for 
determining a compound's efficacy at inhibiting (or potentiating) interaction between the 
bait and prey polypeptides. In certain embodiments, the approximate efficacy of the 
compound can be assessed by generating dose response curves from reporter gene 
expression data obtained using various concentrations of the test compound. Moreover, a 
control assay can also be performed to provide a baseline for comparison. In the control 
assay, expression of the reporter gene is quantitated in the absence of the test compound. 
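
As a non-limiting sketch of how such dose response data might be reduced, the following Python code normalizes reporter readings to the no-compound control and estimates the concentration giving a 50% response by log-linear interpolation between the two bracketing doses. The function names and data are hypothetical, and the interpolation is a simplification chosen for illustration; a fitted sigmoidal model would normally be preferred.

```python
import math


def percent_of_control(signal, control_signal):
    """Reporter signal expressed as a percentage of the no-compound control."""
    return 100.0 * signal / control_signal


def interpolate_ic50(concentrations, responses_pct):
    """Estimate the concentration at which the response crosses 50% of control.

    `concentrations` must be positive and ascending; `responses_pct` holds the
    matching percent-of-control values. Returns None if 50% is never crossed.
    """
    points = list(zip(concentrations, responses_pct))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        crosses = (r_lo - 50.0) * (r_hi - 50.0) <= 0 and r_lo != r_hi
        if crosses:
            # Linear interpolation of the response on a log10 concentration axis.
            fraction = (r_lo - 50.0) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical readings: control signal 200 units, three test concentrations.
doses = [1.0, 100.0]
responses = [percent_of_control(s, 200.0) for s in (150.0, 50.0)]
print(interpolate_ic50(doses, responses))
```

The same normalization applies to agents that potentiate the interaction, in which case responses above 100% of control would be expected and a different threshold would be chosen.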

In an illustrative embodiment, the ITS assay can be used to identify cyclophilin or 
rapamycin mimetics by screening for agents which potentiate the interaction of an FK506 
binding protein (FKBP) and a cyclophilin or TOR1 protein. For example, rapamycin-like 
drugs can be identified by the present invention which have enhanced tissue-type or 
cell-type specificity relative to rapamycin. The identification of such compounds can be 
enhanced by the use of differential screening techniques which detect and compare 
drug-mediated formation of two or more different types of FKBP/cyclophilin or FKBP/TOR 
complexes. To further illustrate, by side-by-side comparison of assays generated with 
mammalian and yeast proteins, the subject ITS can be used to identify rapamycin mimetics 
which preferentially inhibit proliferation of yeast cells or other lower eukaryotes, but which 
have a substantially reduced effect on mammalian cells, thereby improving the therapeutic 
index of the drug as an anti-mycotic agent relative to rapamycin. 

In another exemplary embodiment, a therapeutic target devised as the bait-prey 
complex is contacted with a peptide library with the goal of identifying peptides which 
potentiate or inhibit the bait-prey interaction. Many techniques are known in the art for 
expressing peptide libraries intracellularly. In one embodiment, the peptide library is 
provided as part of a chimeric thioredoxin protein, e.g., expressed as part of the active loop 
(supra). 

In yet another embodiment, the bacterial ITS can be generated in the form of a 
diagnostic assay to detect the interaction of two proteins, e.g., where the gene from one 
is isolated from a biopsied cell. For instance, there are many instances where it is desirable 
to detect mutants which, while expressed at appreciable levels in the cell, are defective at 
binding other cellular proteins. Such mutants may arise, for example, from fine mutations, 
e.g., point mutants, which may be impractical to detect by diagnostic DNA sequencing 
techniques or by immunoassays. The present invention accordingly further 
contemplates diagnostic screening assays which generally comprise cloning one or more 
cDNAs from a sample of cells, and expressing the cloned gene(s) as part of an ITS under 
conditions which permit detection of an interaction between that recombinant gene product 
and a target protein. Accordingly, the present invention provides a convenient method for 
diagnostically detecting mutations to genes encoding proteins which are unable to 
physically interact with a target "bait" protein, which method relies on detecting the 
reconstitution of a transcriptional activator in a bait/prey-dependent fashion. 

To illustrate, the subject ITS can be used to detect inactivating mutations of the 
CDK4/p16INK4a interaction. Recent discoveries have brought several cell-cycle regulators 
into sharp focus as factors in human cancer. Among the most conspicuous types of 
molecule to emerge from ongoing studies in this field are the cyclin-dependent kinase 
inhibitors such as p16 (Serrano et al. (1993) Nature 366:704; and Okamoto et al. (1994) 
PNAS 91:11045). The p16 protein has several hallmarks of a tumor suppressor and is 
perfectly positioned to regulate critical decisions in cell growth. The p16 gene appears to be 
a particularly significant target for mutation in sporadic tumors and in at least one form of 
hereditary cancer. In an exemplary embodiment of the diagnostic ITS, a first hybrid gene 
comprises the coding sequence for a DNA-binding domain fused in frame to the coding 
sequence for a bait protein, e.g., CDK4 or CDK6. The second hybrid gene encodes a 
polymerase interaction domain fused in frame to a gene encoding the sample protein, e.g., a 
p16 gene (cDNA) amplified from a cell sample of a patient. If the bait and sample proteins 
are able to interact, e.g., form a CDK/p16 complex, then RNA polymerase is recruited to the 
promoter of a reporter gene which is operably linked to a DBD recognition element, thereby 
causing expression of the reporter gene. 

Moreover, it will be apparent that the subject two hybrid assay can be used generally 
to detect mutations in other cellular proteins which disrupt protein-protein interactions. For 
example, it has been shown that the transcription factor E2F-4 is bound to the p130 pocket 
protein, and that such binding effectively suppresses E2F-4-mediated trans-activation 
required for control of G0/G1 transition. Mutants which result in disruption of this 
interaction can be detected in the subject assay. 

Similarly, Rb and Rb-like proteins (such as p107) act to control cell-cycle 
progression through the formation of complexes with several cellular proteins. In fact, a 
recent article concerning familial retinoblastoma has reported a new class of Rb mutants 
found in retinal lesions, which mutants were defective in protein binding ("pocket") activity 
(see, for example, Kratzke et al. (1994) Oncogene 9:1321-1326). Moreover, mutant forms 
of c-myc have been demonstrated in various lymphomas, e.g., Burkitt lymphomas, which 
mutants are resistant to p107-mediated suppression. Accordingly, the diagnostic two hybrid 
assay of the present invention can be used to detect mutations in Rb or Rb-like proteins 
which disrupt binding to other cellular proteins, e.g., myc, E2F, c-Abl, or upstream binding 
factor (UBF), or vice-versa. 

In another embodiment, the subject diagnostic assay can be employed to detect 
mutations which disrupt binding of the p53 protein with other cellular proteins, as for 
example, the Wilms' tumor suppressor protein WT1. Recent observations by Maheswaran 
et al. (1993, PNAS 90:5100-5104) have demonstrated that p53 can physically interact with 
WT1, and that this interaction modulates the ability of each protein to transactivate their 
respective targets. In fact, in contrast to the proposed function of WT1 as a transcriptional 
repressor, potent transcriptional activation by WT1 of reporter genes driven by EGR1 in 
cells lacking wild type p53 indicates that transcriptional repression is not an intrinsic 
property of WT1. Instead, transcriptional repression by WT1 may result from its interaction 
with p53. Accordingly, mutations in p53 which do not affect the cellular concentration of 
this protein, but which rather down regulate its ability to bind to and repress WT1, may give 
rise to Wilms' tumors, and other disease states associated with deregulation of WT1. 

In still another embodiment, the diagnostic two hybrid assay can be used to detect 
mutations in pairs of signal transduction proteins. For example, the present assay can be 
used to detect mutations in the ras protein or other cellular proteins which interact with ras, 
e.g., ras GTPase activating proteins (GAPs). 

The method of the present invention, as described above, may be practiced using a 
kit for detecting interaction between a target protein and a sample protein. In an illustrative 
embodiment, the kit includes two vectors, a host cell, and (optionally) a set of primers for 
cloning one or more target proteins from a patient sample. The first vector may contain a 
promoter, a transcription termination signal, and other transcription and translation signals 
functionally associated with the first chimeric gene in order to direct the expression of the 
first chimeric gene. The first chimeric gene includes a DNA sequence that encodes a DNA- 
binding domain and a unique restriction site(s) for inserting a DNA sequence encoding the 
target protein or protein fragment in such a manner that the target protein is expressed as 
part of a hybrid protein with the DNA-binding domain. The first vector also includes a 
means for replicating itself (e.g., an origin of replication) in the host cell. In preferred 
embodiments, the first vector also includes a first marker gene, the expression of which in 
the host cell permits selection of cells containing the first marker gene from cells that do not 
contain the first marker gene. Preferably, the first vector is a plasmid. 

The kit also includes a second vector which contains a second chimeric gene. The 
second chimeric gene also includes a promoter and other relevant transcription and 
translation sequences to direct expression of the prey fusion protein. The second chimeric 
gene also includes a DNA sequence that encodes a polymerase interaction domain (or an 
activation domain) and a unique restriction site(s) to insert a DNA sequence encoding the 
sample protein, or fragment thereof, into the vector in such a manner that the sample protein 
is capable of being expressed as part of a hybrid protein with the polymerase interaction 
domain. 

In general, the kit will also be provided with one of the two vectors already 
including the bait protein. For example, the kit can be configured for detecting mutations to 
a p16 gene which result in loss of binding to CDK4. Accordingly, the first vector could be 
provided with a CDK4 open reading frame fused in frame to the DNA-binding domain to 
provide a CDK4 bait protein. p16 gene open reading frames can be cloned from a cell 
sample and ligated into the second vector in frame with the polymerase interaction domain. 

Where the kit also provides primers for cloning a p16 gene into the two-hybrid assay 
vectors, the primers will preferably include restriction endonuclease sites for facilitating 
ligation of the amplified gene into the insertion site flanking the DNA-binding domain or 
activating domain. 
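
By way of illustration only, the following minimal Python sketch shows one way primers carrying restriction endonuclease sites could be assembled. It is not taken from the specification: the function names, the 4-base 5' clamp, the 18-base annealing length, the choice of EcoRI and BamHI sites, and the placeholder open reading frame (which is not a p16 sequence) are all assumptions made for the example, and the sketch does not adjust the reading frame relative to the DNA-binding or activating domain.

```python
# Hypothetical helper names; the sequences below are placeholders, not real p16 sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf: str, fwd_site: str, rev_site: str,
                   anneal_len: int = 18, clamp: str = "GCGC"):
    """Build forward and reverse primers as clamp + restriction site + annealing region.

    The annealing regions are taken from the first and last `anneal_len`
    bases of the open reading frame; reading-frame spacers are NOT added.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("open reading frame is shorter than the annealing length")
    forward = clamp + fwd_site + orf[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse


ECORI = "GAATTC"  # EcoRI recognition sequence
BAMHI = "GGATCC"  # BamHI recognition sequence

# Placeholder ORF used only to exercise the functions (assumption).
PLACEHOLDER_ORF = "ATGGCTGCTGAACCGTCTGGTAAAGCTCTGGAAGCTGCTCGTTAA"

fwd, rev = design_primers(PLACEHOLDER_ORF, ECORI, BAMHI)
print("forward primer:", fwd)
print("reverse primer:", rev)
```

In practice the sites would be chosen to match the unique restriction site(s) of the vector and to be absent from the amplified gene, and any bases needed to keep the insert in frame would be added between the site and the annealing region.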

Accordingly in using the kit, the interaction of the target protein and the sample 
protein in the host cell causes a measurably greater expression of the reporter gene than 
when the DNA-binding domain and the polymerase interaction domain are present in the 
absence of an interaction between the two fusion proteins. The cells containing the two 
hybrid proteins are incubated in/on an appropriate medium and the cells are monitored for 
the measurable activity of the gene product of the reporter construct. A positive test for this 
activity is an indication that the target protein and the sample protein have interacted. Such 
interaction brings their respective DNA-binding and polymerase interaction domains into 
sufficiently close proximity to cause efficient transcription of the reporter gene. 
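
As a purely illustrative sketch of how "measurably greater" reporter expression might be judged, the following Python example compares replicate reporter readings from cells expressing both fusion proteins with readings from control cells, using an exact one-sided permutation test. The specification does not prescribe any particular statistical test; the choice of test, the function name, and the numerical readings are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test, control):
    """Exact one-sided permutation test on the difference of group means.

    Returns the fraction of relabelings of the pooled readings whose mean
    difference (test minus control) is at least as large as the observed one.
    """
    pooled = list(test) + list(control)
    n_test = len(test)
    observed = mean(test) - mean(control)
    total = 0
    at_least_as_large = 0
    for idx in combinations(range(len(pooled)), n_test):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Hypothetical reporter readings (arbitrary units), invented for illustration.
interaction_cells = [950.0, 1010.0, 980.0, 1040.0]
control_cells = [100.0, 110.0, 95.0, 105.0]

p = permutation_p_value(interaction_cells, control_cells)
print(f"one-sided permutation p-value: {p:.4f}")
```

With four replicates per group there are only 70 possible relabelings, so the smallest attainable p-value is 1/70; more replicates would be needed to support a stricter significance threshold.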

Exemplification 

The invention, now being generally described, will be more readily understood by 
reference to the following examples, which are included merely for purposes of illustration 
of certain aspects and embodiments of the present invention and are not intended to limit 
the invention. 

The C-terminal domain of the alpha subunit of RNA polymerase (α-CTD) mediates 
the effects of many transcriptional activators in bacteria, likely through direct contact. The 
α-CTD was replaced with the C-terminal domain of the bacteriophage λ repressor, a 
domain that forms dimers and higher order oligomers. It is then demonstrated that an 
artificial promoter bearing a single λ operator in its upstream region is activated by λ 
repressor in cells that express the hybrid α gene. The following examples further show that 
mutations in λ repressor that weaken the CTD oligomerization interaction also decrease 
activation in the strain bearing the hybrid α gene. These findings show that the strength of 
an arbitrary protein-protein interaction determines the magnitude of gene activation. Thus, 
for at least certain promoters, recruitment of RNA polymerase to the DNA is sufficient for 
gene activation. 

RNA polymerase in E. coli consists of an enzymatic core composed of subunits α, 
β, and β' in the stoichiometry α2ββ', and one of several alternative σ factors responsible for 
specific promoter recognition. The α subunit, which initiates the assembly of RNA 
polymerase by forming a dimer, has two independently folded domains. The larger 
amino-terminal domain (α-NTD) mediates dimerization and the subsequent assembly of 
polymerase. The carboxy-terminal domain (α-CTD), which is tethered to the α-NTD by a 
flexible linker region, interacts with a DNA sequence known as the "UP-element" that is 
found upstream of the -35 region of certain particularly strong promoters. The α-CTD is 
also the target of action of a large class of transcriptional activators. 

The Cyclic AMP Receptor Protein (CRP) is the most intensively studied example of 
a transcriptional activator that exerts its effect on the α-CTD. Several lines of evidence 
indicate that CRP uses a well-defined activating region consisting of a nine amino acid 
surface-exposed loop to contact the α-CTD directly when bound to its recognition site 
(centered at position -61.5) upstream of the familiar lac promoter. In the case of CRP as 
well as several other activators, specific amino acid residues in the α-CTD have been 
identified that are required for activation. The available evidence suggests that activation 
by this class of activators involves direct contact with one or another target region on the 
α-CTD. However, this evidence does not establish whether the α-CTD plays some special 
role or whether any protein-protein contact would suffice. 

To address this question, the natural interaction between activator and α-CTD was 
replaced with a different interaction involving a protein domain that does not ordinarily 
mediate transcriptional activation. To do this, the well-defined properties of the C-terminal 
domain (CTD) of the bacteriophage λ repressor were relied upon. 

The λ repressor (λcI) is a two-domain protein that functions as both a repressor and 
an activator of transcription. λcI binds DNA as a dimer, and pairs of dimers bind 
cooperatively to adjacent operator sites (Figure 1A). The N-terminal domain contacts the 
DNA and interacts with RNA polymerase when λcI is bound at promoter PRM, whereas the 
CTD mediates both dimer formation and the dimer-dimer interaction that results in 
cooperativity. A large number of λcI mutants specifically defective for cooperative 
binding to DNA have been isolated and these mutants bear single amino acid substitutions 
in the CTD. 

It was reasoned that if the α-CTD was replaced with the λcI-CTD, the resulting α-cI 
fusion protein would display a dimeric target that could be contacted by an appropriately 
positioned λcI dimer (Figure 1B). This would test whether the same protein-protein 
interaction that ordinarily mediates the cooperative binding of pairs of λcI dimers to the 
DNA would mediate transcriptional activation when the λcI-CTD is tethered to the α-NTD. 

The hybrid α gene was created by replacing the gene segment encoding the α-CTD 
with a gene segment encoding the λcI-CTD. A derivative of the lac promoter bearing a 
single λ operator (OR2) in place of the CRP-binding site was created (centered 62 bp 
upstream of the transcription startpoint) (Figure 1B). Ordinarily, λcI activates transcription 
when bound at a unique position centered at position -42; as expected, therefore, λcI does 
not activate transcription from this lac promoter derivative. 

The lac promoter derivative was introduced in single copy into the chromosome of 
E. coli strain MC1000 F'lacIq. Compatible vectors driving the expression of the hybrid α 
gene and the cI gene were also introduced into this strain. λcI stimulated transcription from 
the lac promoter derivative a maximum of approximately 10-fold as measured by 
β-galactosidase assays. This stimulation was observed only in the presence of the hybrid α 
gene; in its absence λcI repressed transcription slightly. Furthermore, expression of the 
α-cI fusion protein had no significant effect on transcription from the lac promoter derivative 
in the absence of λcI. Primer extension analysis confirmed that the stimulatory effect of λcI 
reflected an increase in correctly initiated transcripts. 
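
For illustration, a fold-stimulation figure of this kind can be derived from β-galactosidase assay readings as sketched below in Python. The sketch assumes the conventional Miller unit calculation, which the specification does not state was used; the function names and all absorbance readings are hypothetical values chosen only to show the arithmetic.

```python
def miller_units(od420: float, od550: float, od600: float,
                 minutes: float, volume_ml: float) -> float:
    """Conventional Miller unit formula for beta-galactosidase activity.

    od420, od550: absorbances of the reaction mixture
    od600: culture density before the assay
    minutes: reaction time; volume_ml: culture volume assayed
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def fold_activation(activated: float, basal: float) -> float:
    """Ratio of reporter activity with the activator to activity without it."""
    if basal <= 0:
        raise ValueError("basal activity must be positive")
    return activated / basal


# Hypothetical readings (assumptions), not data from the examples.
with_activator = miller_units(od420=0.9, od550=0.02, od600=0.5,
                              minutes=10, volume_ml=0.1)
without_activator = miller_units(od420=0.1215, od550=0.02, od600=0.5,
                                 minutes=10, volume_ml=0.1)

print(f"with activator:    {with_activator:.1f} Miller units")
print(f"without activator: {without_activator:.1f} Miller units")
print(f"fold activation:   {fold_activation(with_activator, without_activator):.1f}")
```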

Our hypothesis concerning the mechanism of this activation predicts that a λcI 
mutant unable to bind cooperatively to the DNA would be unable to activate transcription in 
this artificial system. To test this prediction an experiment was designed using the λcI 
cooperativity mutant (λcI-D197G) that is unable to bind cooperatively to both adjacent and 
separated operator sites, but is otherwise fully functional (i.e., its binding to a single operator 
site in vivo is indistinguishable from that of wild type λcI). Unlike wild type λcI, this 
mutant failed to activate transcription from the lac promoter derivative in the presence of 
the hybrid α gene. 

Furthermore, several λcI mutants with specific but less severe cooperativity defects 
were also utilized in similar experiments. Substitutions N148D and R196M weaken, but do 
not abolish, the dimer-dimer interaction responsible for cooperativity. Mutant R196M is 
more defective for cooperative binding than mutant N148D, and, like mutant D197G, both 
λcI-N148D and λcI-R196M behave indistinguishably from wild type λcI in binding to a 
single operator site in vivo. The two mutants stimulated transcription from the lac promoter 
derivative more weakly than wild type λcI, and the stronger cooperativity mutant also 
manifested a stronger activation defect. 

The equilibrium dissociation constant for the interaction of λcI dimers in solution is 
about 10⁻⁶ M, and cooperative binding to DNA likely involves this same interaction. These 
results suggest that any protein-protein interaction of comparable strength involving a 
DNA-bound protein and a protein domain tethered to the α-NTD would bring about 
transcriptional activation. The analysis of the λcI cooperativity mutants indicates that the 
magnitude of the activation decreases as the dimer-dimer interaction is weakened. It is not 
known what would be the effect of increasing the strength of the dimer-dimer interaction. It 
will be interesting to learn how strong an interaction would result in maximal activation. It 
is possible that a sufficiently strong interaction might impede promoter clearance and, 
therefore, result in transcriptional repression rather than activation. 

Our results indicate that a protein domain with no determinants for DNA-binding 
can mediate transcriptional activation when tethered to the α-NTD simply by providing a 
surface that can be contacted by a DNA-bound protein. The discovery of the DNA-binding 
capability of the α-CTD suggested that activators that interact with the α-CTD might help 
stabilize its association with DNA at promoters that lack an UP element. In support of this 
idea, footprinting studies have indicated that the interaction between CRP and the α-CTD at 
the lac promoter promotes the association of the α-CTD with the DNA adjacent to the 
CRP-binding site and upstream of the promoter -35 region. This observation has prompted the 
proposal that other, and perhaps all, activators that interact with the α-CTD function by 
recruiting the α-CTD to the DNA. These findings, however, imply that activation can occur 
in the absence of this recruitment. 
This new protein-protein contact alone suffices for gene activation, suggesting that a 
DNA-bound activator can recruit the holoenzyme to a promoter simply by touching an 
available target surface. These findings in E. coli imply that in prokaryotes, activation can 
be elicited by a simple protein-protein contact involving a DNA-bound activator on the one 
hand and an available target surface within the RNA polymerase holoenzyme on the other. 

λcI normally activates transcription at the λ PRM promoter using an activation patch 
on its N-terminal domain to contact the σ subunit of RNA polymerase. This contact 
requires that λcI be bound just upstream of the PRM -35 region at a site centered at position 
-42. An experiment was designed to ask whether λcI bound at this position could use both 
its normal activation patch and its C-terminal domain to make simultaneous contacts with 
RNA polymerase in a strain expressing the α-cI fusion protein. This was found to work 
spectacularly well. Whereas λcI normally stimulates PRM transcription by a factor of less 
than 10, an approximately 100-fold stimulation in a strain expressing the α-cI fusion was 
observed. 

This finding suggests that one could use this setup to detect extremely weak 
protein-protein interactions. In fact, the data with the D197G mutant show that with this 
assay a weak residual interaction can be detected. 



All of the above-cited references and publications are hereby incorporated by 
reference. 

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than 
routine experimentation, numerous equivalents to the specific polypeptides, nucleic acids, 
methods, assays and reagents described herein. Such equivalents are considered to be 
within the scope of this invention and are covered by the following claims. 

We claim: 

1. A method for detecting interaction between a first test polypeptide and a second test 
polypeptide, comprising 

i. providing an interaction trap system including a prokaryotic host cell which contains 

(a) a reporter gene operably linked to a transcriptional regulatory sequence which 
includes a binding site ("DBD recognition element") for a DNA-binding 
domain, 

(b) a first chimeric gene which encodes a first fusion protein, said first fusion 
protein including a DNA-binding domain and first test polypeptide, 

(c) a second chimeric gene which encodes a second fusion protein including an 
activation tag which activates transcription of the reporter gene when localized to the 
vicinity of the DBD recognition element, 

wherein interaction of the first fusion protein and second fusion protein in the host 
cell results in measurably greater expression of the reporter gene; 

ii. measuring expression of said reporter gene; and 

iii. comparing the level of expression of said reporter gene to a level of expression in a 
control interaction trap system in which one or both of the first and second test 
polypeptides are missing from the first and second fusion proteins and resulting 
fusion proteins do not interact, 

wherein a statistically significant increase in the level of expression is indicative of an 
interaction between the first and second test polypeptide portions of the fusion proteins. 

2. The method of claim 1, wherein the activation tag is a polymerase interaction 
domain (PID) which forms active RNA polymerase complexes in the host cell. 

3. The method of claim 2, wherein the PID includes at least a portion of an RNA 
polymerase subunit. 

4. The method of claim 3, wherein the PID includes at least a portion of an α or ω 
polymerase subunit. 

5. The method of claim 1, wherein the host cell is selected from the group consisting of 
bacterial strains of Escherichia, Bacillus, Streptomyces, Pseudomonas, Salmonella, Serratia 
and Shigella. 

6. The method of claim 1, wherein the reporter gene encodes a gene product that gives 
rise to a detectable signal selected from the group consisting of: color, fluorescence, 
luminescence, cell viability, relief of a cell nutritional requirement, cell growth, and drug 
resistance. 

7. The method of claim 1, wherein the reporter gene encodes a gene product selected 
from the group consisting of chloramphenicol acetyl transferase, luciferase, β-galactosidase 
and alkaline phosphatase. 

8. The method of claim 1, wherein at least one of the first and second test polypeptides 
are from a nucleic acid library. 

9. The method of claim 1, wherein the DNA-binding domain includes a DNA binding 
portion of a transcriptional regulatory protein. 

10. The method of claim 1, wherein the first fusion protein also includes an 
oligomerization motif. 

11. A kit for detecting interaction between a first test polypeptide and a second test 
polypeptide, the kit comprising: 

i. a first vector for encoding a first fusion protein ("bait fusion protein"), which vector 
comprises a first gene including: 

(1) transcriptional and translational elements which direct expression in a 
prokaryotic host cell, 

(2) a DNA sequence that encodes a DNA-binding domain and which is functionally 
associated with the transcriptional and translational elements of the first gene, 
and 

(3) a means for inserting a DNA sequence encoding a first test polypeptide into the 
first vector in such a manner that the first test polypeptide is capable of being 
expressed in-frame as part of a bait fusion protein containing the DNA binding 
domain; 

ii. a second vector for encoding a second fusion protein ("prey fusion protein"), which 
comprises a second gene including: 

(1) transcriptional and translational elements which direct expression in a 
prokaryotic host cell, 

(2) a DNA sequence that encodes a polymerase interaction domain (PID) which 
forms active RNA polymerase complexes in the prokaryotic host cell, the PID 
DNA sequence being functionally associated with the transcriptional and 
translational elements of the second gene, and 

(3) a means for inserting a DNA sequence encoding the second test polypeptide 
into the second vector in such a manner that the second test polypeptide is 
capable of being expressed in-frame as part of a prey fusion protein containing 
the polymerase interaction domain; and 

iii. a prokaryotic host cell containing a reporter gene having a binding site ("DBD 
recognition element") for the DNA-binding domain, wherein the reporter gene 
expresses a detectable protein when a prey fusion protein interacts with a bait fusion 
protein bound to the DBD recognition element; the host cell being incapable of 
expressing any appreciable level of a protein having the function of (a) the first 
marker gene, (b) the second marker gene, (c) the DNA-binding domain, and (d) the 
polymerase interaction domain; 

wherein binding of the first test polypeptide and the second test polypeptide in the host cell 
results in measurably greater expression of the reporter gene than the simultaneous presence 
of the DNA-binding domain and the polymerase interaction domain in the absence of an 
interaction between the first test polypeptide and the second test polypeptide. 

12. The kit of claim 11, wherein the activation tag is a polymerase interaction domain 
(PID) which forms active RNA polymerase complexes in the host cell. 

13. The kit of claim 12, wherein the PID includes at least a portion of an RNA 
polymerase subunit. 

14. The kit of claim 13, wherein the PID includes at least a portion of an α or ω 
polymerase subunit. 

15. The kit of claim 11, wherein the host cell is selected from the group consisting of 
bacterial strains of Escherichia, Bacillus, Streptomyces, Pseudomonas, Salmonella, Serratia 
and Shigella. 



16. The kit of claim 11, wherein the reporter gene encodes a gene product that gives rise 
to a detectable signal selected from the group consisting of: color, fluorescence, 
luminescence, cell viability, relief of a cell nutritional requirement, cell growth, and drug 
resistance. 



17. The kit of claim 11, wherein the reporter gene encodes a gene product selected from 
the group consisting of chloramphenicol acetyl transferase, luciferase, β-galactosidase and 
alkaline phosphatase. 

18. The kit of claim 11, wherein at least one of the first and second test polypeptides are 
from a nucleic acid library. 

19. The kit of claim 11, wherein the DNA-binding domain includes a DNA binding 
portion of a transcriptional regulatory protein. 



20. The kit of claim 11, wherein the first fusion protein also includes an oligomerization 
motif. 

21. A method for isolating a nucleic acid encoding a polypeptide which interacts with a 
selected protein target, comprising 

i. providing an interaction trap system including a variegated population of 
prokaryotic host cells which each include: 

(a) a reporter gene operably linked to a transcriptional regulatory sequence which 
includes a binding site ("DBD recognition element") for a DNA-binding 
domain, 

(b) a first chimeric gene which encodes a first fusion protein, said first fusion 
protein including a DNA-binding domain and first test polypeptide, 

(c) a second chimeric gene which encodes a second fusion protein including an 
activation tag which activates transcription of the reporter gene when localized to the 
vicinity of the DBD recognition element, 

wherein interaction of the first fusion protein and second fusion protein in the host 
cell results in measurably greater expression of the reporter gene, and one of the first 
or second chimeric genes is present in the host cell population as a variegated 
population with respect to sequence encoding test polypeptides; 

ii. measuring expression of said reporter gene under conditions wherein a statistically 
significant increase in the level of expression of the reporter gene is indicative of an 
interaction between the first and second test polypeptide portions of the fusion 
proteins; and 

iii. selecting cells from the host cell population on the basis of the level of expression of 
said reporter gene. 

22. The method of claim 21, wherein the activation tag is a polymerase interaction 
domain (PID) which forms active RNA polymerase complexes in the host cell. 

23. The method of claim 22, wherein the PID includes at least a portion of an RNA 
polymerase subunit. 

24. The method of claim 23, wherein the PID includes at least a portion of an α or ω 
polymerase subunit. 

25. The method of claim 21, wherein the host cell is selected from the group consisting 
of bacterial strains of Escherichia, Bacillus, Streptomyces, Pseudomonas, Salmonella, 
Serratia and Shigella. 

26. The method of claim 21, wherein the reporter gene encodes a gene product that 
gives rise to a detectable signal selected from the group consisting of: color, fluorescence, 
luminescence, cell viability, relief of a cell nutritional requirement, cell growth, and drug 
resistance. 

27. The method of claim 21, wherein the reporter gene encodes a gene product selected 
from the group consisting of chloramphenicol acetyl transferase, luciferase, β-galactosidase 
and alkaline phosphatase. 

28. The method of claim 21, wherein the DNA-binding domain includes a DNA binding 
portion of a transcriptional regulatory protein. 

29. The method of claim 21, wherein the first fusion protein also includes an 
oligomerization motif. 